Agreement assessment

  • Agreement: the degree of similarity between measurements.

  • Depending on the context, agreement assessment is called by different names:

    • Measurements taken with the same measurement method: reliability, repeatability.

    • Measurements taken with different measurement methods: inter-rater reliability, reproducibility, concordance analysis, comparison data analysis, inter-rater agreement.

  • The methodology also depends on the type of data (nominal, ordinal, continuous).

Concordance analysis

  • A (continuous) characteristic is measured by several measurement methods.

  • Aim: to estimate the degree of agreement between the methods.

  • Example: two devices (M1, M2) measuring systolic blood pressure.

  • Measurements taken at the same time → same “true value”.

Assessment of concordance

Conditions for perfect concordance:

1) Equality of means \(\mu_1=\mu_2\).

Example: the means are equal, but the standard deviations differ and the correlation is below 1, so concordance is not perfect.

m1     m2     s1     s2     r
120    120    24.4   31.6   0.83

Assessment of concordance

2) Perfect correlation \(\rho_{12}=1\).

Example: the correlation is perfect and the means are equal, but the standard deviations still differ.

m1     m2     s1     s2     r
120    120    24.4   29.5   1

Assessment of concordance

3) Equality of variances \(\sigma^{2}_{1}=\sigma^{2}_{2}\).

Example: all three conditions hold (equal means, perfect correlation, equal variances), so concordance is perfect.

m1     m2     s1     s2     r
120    120    24.4   24.4   1

Concordance coefficient

  • Concordance correlation coefficient (Lin, 1989); a minimal R sketch of its sample version follows this list:

\[\rho_{CCC}=\frac{2\cdot\rho_{12}\sigma_1\sigma_2}{\sigma^{2}_{1}+\sigma^{2}_{2}+(\mu_{1}-\mu_{2})^{2}}\]

  • Possible values between \(-1\) and \(1\).

  • Complete disagreement \(\rightarrow \rho_{CCC}=0\)

  • Perfect agreement \(\rightarrow \rho_{CCC}=1\)

  • Negative values may indicate:

    • Independence between the methods (the true parameter value is near 0).
    • Errors in the data.
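
A minimal R sketch of the sample version of \(\rho_{CCC}\), assuming two numeric vectors of paired measurements. The data are simulated and the helper ccc_lin is hypothetical; the cccrm functions used later (e.g. ccc_vc) provide the actual estimators with inference.

# Sample analogue of Lin's CCC for paired measurements x (method 1) and y (method 2)
ccc_lin <- function(x, y) {
  n <- length(x)
  mx <- mean(x); my <- mean(y)
  sx2 <- var(x) * (n - 1) / n       # 1/n ("biased") variances and covariance
  sy2 <- var(y) * (n - 1) / n
  sxy <- cov(x, y) * (n - 1) / n    # covariance = rho12 * sigma1 * sigma2
  2 * sxy / (sx2 + sy2 + (mx - my)^2)
}

set.seed(1)
m1 <- rnorm(100, mean = 120, sd = 24)   # simulated systolic BP readings, device M1
m2 <- m1 + rnorm(100, sd = 10)          # device M2: same truth plus extra error
ccc_lin(m1, m2)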

Interpretation of Concordance coefficient

  • The Landis and Koch (1977) criteria for Kappa also apply to the CCC:

CCC          Interpretation
< 0.2        Slight
[0.2, 0.4)   Fair
[0.4, 0.6)   Moderate
[0.6, 0.8)   Substantial
>= 0.8       Almost Perfect

  • Alternative criteria from Koo and Li (2016):

CCC           Interpretation
< 0.5         Poor
[0.5, 0.75)   Moderate
[0.75, 0.9)   Good
>= 0.9        Excellent

Inference

  • Confidence interval.

    • Based on Normal distribution.
    • Fisher’s Z transformation is commonly used (it matters most with small sample sizes); see the sketch after this list.

    \[Z=\frac{1}{2}\ln\left(\frac{1+\rho}{1-\rho}\right)\]

  • Hypothesis testing.

    • Is the CCC greater than a specific value? Commonly 0.8 or 0.9.
    • Are two CCCs different?
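
A short R sketch of both ideas, using a hypothetical CCC estimate and a hypothetical standard error of its Fisher Z transform; in practice both quantities come from the fitted estimator (e.g. the ccc_vc output shown later).

ccc_hat <- 0.87                         # hypothetical CCC estimate
se_z    <- 0.04                         # hypothetical SE of Z = atanh(CCC)

z_hat <- atanh(ccc_hat)                 # Fisher's Z transformation
z_ci  <- z_hat + c(-1, 1) * qnorm(0.975) * se_z
tanh(z_ci)                              # 95% CI back-transformed to the CCC scale

# One-sided test of H0: CCC <= 0.8 against H1: CCC > 0.8, on the Z scale
z0 <- atanh(0.8)
pnorm((z_hat - z0) / se_z, lower.tail = FALSE)   # p-value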

Concordance with repeated measurements

  • Every subject is assessed more than one time by each method: replicates.

  • Measurements taken at the same time → same “true value”.

  • Lin’s estimator cannot be applied to such a design.

Concordance with repeated measurements

  • Naive method: use the mean of the repeated measurements.

  • Only correct if taking the mean of the measurements is the actual measurement process.

  • Otherwise, the concordance is artificially inflated: averaging reduces the within-subject variability and the measurement error.

  • Alternative: Carrasco and Jover (2003) demonstrated the equivalence between the CCC and the Intraclass Correlation Coefficient (ICC).

Intraclass correlation coefficient

  • ICC is based on the decomposition of the (outcome) variance into between and within-subjects variance components.

  • Between-subjects variance: \(\sigma_{\alpha}^2\)

  • Within-subjects variance (discordance components):

    • Between-methods variability: \(\sigma_{\beta}^2\)
    • Random error variance: \(\sigma_{e}^2\)
    • Also possible to add subjects-methods interaction: \(\sigma_{\alpha\beta}^2\)
  • Variance components are estimated using a linear mixed effects model.

\[\rho_{ICC}=\frac{\sigma_{\alpha}^2}{\sigma_{\alpha}^2+\sigma_{\beta}^2+\sigma_{\alpha\beta}^2+\sigma_{e}^2}\]

  • In this context, this ICC is known as CCC estimated by variance components.
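
As a quick check of this formula, plugging in the variance components reported for the blood pressure example below (rounded) reproduces the reported CCC.

# Variance components copied from the ccc_vc() output shown below (rounded)
s_subj <- 380.1875     # sigma_alpha^2       (Subjects)
s_sm   <- 0.0000006    # sigma_alphabeta^2   (Subjects-Method)
s_met  <- 2.2953       # sigma_beta^2        (Method)
s_err  <- 52.8673      # sigma_e^2           (Error)

s_subj / (s_subj + s_sm + s_met + s_err)   # ~0.8733, matching the CCC reported below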

Blood Pressure example

  • Blood pressure (BP) measured twice on 384 subjects using two devices.

CCC estimate

  • cccrm R package (Carrasco et al., 2013).

  • First release (v1.0.0) in 2011. Authors: Josep L. Carrasco and Josep Puig.

  • Last release (v3.0.4) in February 2025. Authors: Josep L. Carrasco and Gonzalo Peón.

library(cccrm)

# ry = outcome (systolic BP), rind = subject identifier, rmet = measurement method;
# int = TRUE adds the subject-method interaction component
est<-ccc_vc(bpres,ry="SIS",rind="ID",rmet="METODE",int=TRUE)
summary(est)
         Subjects   Subjects-Method            Method             Error 
380.1874530093099   0.0000005623391   2.2953421378809  52.8673411494370 

CCC estimated by variance compoments 
       CCC  LL CI 95%  UL CI 95%     SE CCC 
0.87329122 0.85313329 0.89084538 0.00959477 

Longitudinal repeated measurements

  • Suppose the repeated measurements are not taken at the same time.

  • The subject’s “true value” may change between measurements.

  • There is no point in evaluating the concordance between measurements taken at different times (e.g. M11 vs M22, or M12 vs M21).

Longitudinal repeated measurements

  • More variance components needed.

  • Between-subjects variance:

    • Subjects: \(\sigma_{\alpha}^2\)
    • Subjects-Time: \(\sigma_{\alpha\gamma}^2\).
  • Within-subjects variance (discordance components):

    • Methods-time: \(\sigma_{\beta\gamma}^2\)
    • Subjects-Methods: \(\sigma_{\alpha\beta}^2\)
    • Random error: \(\sigma_{e}^2\)

\[\rho_{ICC}=\frac{\sigma_{\alpha}^2+\sigma_{\alpha\gamma}^2}{\sigma_{\alpha}^2+\sigma_{\alpha\gamma}^2+\sigma_{\beta\gamma}^2+\sigma_{\alpha\beta}^2+\sigma_{e}^2}\]
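
As before, plugging the variance components from the longitudinal blood pressure output below (rounded) into this formula reproduces the reported CCC.

# Variance components copied from the ccc_vc() output shown below (rounded)
s_subj <- 373.5473     # sigma_alpha^2       (Subjects)
s_st   <- 20.6959      # sigma_alphagamma^2  (Subjects-Time)
s_sm   <- 3.6918       # sigma_alphabeta^2   (Subjects-Method)
s_mt   <- 2.2794       # sigma_betagamma^2   (labelled "Method" in the output)
s_err  <- 30.6539      # sigma_e^2           (Error)

(s_subj + s_st) / (s_subj + s_st + s_sm + s_mt + s_err)   # ~0.915, as reported below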

CCC estimate

# rtime = time variable; adds the time-related variance components to the model
est<-ccc_vc(bpres,ry="SIS",rind="ID",rmet="METODE",rtime="NM")
summary(est)
       Subjects Subjects-Method   Subjects-Time          Method           Error 
      373.54730         3.69175        20.69585         2.27938        30.65388 

CCC estimated by variance compoments 
        CCC   LL CI 95%   UL CI 95%      SE CCC 
0.914997171 0.900184010 0.927695639 0.006993417 

CCC for longitudinal repeated measurements

  • Assumption: the same level of agreement at all times.

CCC estimates by time

# CCC estimated separately at each time point; plotit = TRUE returns a plot and
# test = TRUE runs the equality test across times
est_time<-ccc_est_by_time(bpres,ry="SIS",rind="ID",rmet="METODE",rtime="NM",
                plotit=TRUE,test=TRUE)
est_time$plot

est_time$ccc
  NM       CCC     LL95      UL95
1  1 0.9153748 0.897656 0.9301388
2  2 0.9145399 0.896657 0.9294434

CCC estimates by time: equality test

  • Test statistic: \(\theta=b'\Sigma^{-1} b\) (Vanbelle, 2017).
  • \(b\): vector of the \(t-1\) differences between the CCC estimates over time.
  • \(\Sigma\): variance-covariance matrix of these differences.
  • \(\Sigma\) is estimated by a non-parametric cluster bootstrap (500 resamples by default).
  • Under the null hypothesis of equality of the CCCs, \(\theta\) follows a Chi-square distribution with \(t-1\) degrees of freedom, where \(t\) is the number of time points.
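
A minimal sketch of this statistic with hypothetical numbers for t = 2 time points; the difference roughly matches the estimates shown above, but the bootstrap variance is made up for illustration. ccc_est_by_time() computes all of this internally, and its actual test output follows.

b     <- matrix(0.9154 - 0.9145)   # t - 1 = 1 difference between CCC estimates
Sigma <- matrix(2e-04)             # bootstrap variance of the difference (made up)

theta <- drop(t(b) %*% solve(Sigma) %*% b)
pchisq(theta, df = nrow(b), lower.tail = FALSE)   # p-value, df = t - 1
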
est_time$res_test
        Chi.Sq DF    Pvalue
1 0.0004148078  1 0.9837507

Body fat data

  • Percentage body fat obtained from skinfold calipers and DEXA on a cohort of 90 adolescent girls.

  • Measurements were taken at ages 12.5, 13, and 13.5 years.

CCC estimate

est<-ccc_vc(bfat,ry="BF",rind="SUBJECT",rmet="MET",rtime="VISITNO")
summary(est)
       Subjects Subjects-Method   Subjects-Time          Method           Error 
      8.5920144       2.1082341       0.9204080       5.1576360       0.7697708 

CCC estimated by variance compoments 
       CCC  LL CI 95%  UL CI 95%     SE CCC 
0.54207819 0.43613835 0.63319749 0.05031125 

CCC estimates by time

est_time<-ccc_est_by_time(bfat,ry="BF",rind="SUBJECT",rmet="MET",rtime="VISITNO",
                     plotit=TRUE,test=TRUE)
est_time$plot

est_time$ccc
  VISITNO       CCC      LL95      UL95
1    12.5 0.6693741 0.5520280 0.7607205
2      13 0.4837805 0.3698900 0.5833467
3    13.5 0.4886355 0.3737142 0.5887816

CCC estimates by time: test

est_time$res_test
    Chi.Sq DF         Pvalue
1 25.94093  2 0.000002328088
est_time$pair_comp
       Difs     Estimate         SE         Adj.P
1   12.5-13  0.185593602 0.04819816 0.00023562237
2 12.5-13.5  0.180738623 0.04122848 0.00003498342
3   13-13.5 -0.004854979 0.05254474 0.92638257992

References

Carrasco JL, Jover L. (2003). Estimating the generalized concordance correlation coefficient through variance components. Biometrics, 59, 849-858.

Carrasco JL, Phillips BR, Puig-Martinez J, King TS, Chinchilli V. (2013). Estimation of the concordance correlation coefficient for repeated measures using SAS and R. Computer Methods and Programs in Biomedicine, 109(3), 293-304.

Landis JR, Koch GG. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159-174.

Koo TK, Li MY. (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine, 15(2), 155-163.

Lin LI. (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics, 45, 255-268.

Vanbelle S. (2017). Comparing dependent kappa coefficients obtained on multilevel data. Biometrical Journal, 59(5), 1016-1034.