Agreement for polytomous outcomes

Last updated on Jan 9, 2024

This document describes the use of the Agree package for the two data examples used in the paper on specific agreement for polytomous outcomes with more than two raters (de Vet, Mullender, and Eekhout 2018). The first example uses ordinal ratings and the second uses nominal ratings.

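library(magrittr)  # provides the %>% pipe used below
library(knitr)     # provides kable() for the printed tables
library(Agree)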

## 
## Attaching package: 'Agree'

## The following object is masked from 'package:base':
## 
##     kappa

Ordinal data example

For the ordinal data example we use data from a study by Dikmans et al. (2017). The data are based on photographs of the breasts of 50 women after breast reconstruction. The photographs were independently scored by 5 surgeons, the patients, and three mothers. They each rated the quality of the reconstruction on a 5-point ordinal scale with the verbal anchors ‘very dissatisfied’ at the left end and ‘very satisfied’ at the right end. They specifically rated the volume, shape, symmetry, scars and nipple. Here we use the ratings of 4 surgeons, because one surgeon had some missing values, and we look at the ratings for symmetry. Data set 1 is used for the example of ordinal categories.

data(breast)

variable <- "symmetry"
raters <- c("PCH1", "PCH2", "PCH3", "PCH4")
ratersvars <- paste(raters, variable, sep="_")
data1 <- data.frame(breast[ratersvars])

data1 %>% head()

##       PCH1_symmetry  PCH2_symmetry PCH3_symmetry PCH4_symmetry
## 1         satisfied very satisfied     satisfied     satisfied
## 2           neutral        neutral  dissatisfied       neutral
## 3         satisfied        neutral       neutral       neutral
## 4      dissatisfied        neutral  dissatisfied  dissatisfied
## 5 very dissatisfied      satisfied  dissatisfied     satisfied
## 6         satisfied        neutral     satisfied     satisfied

Agreement table

First, the agreement tables for all rater pairs are summed into one agreement table. Then the off-diagonal cells are averaged to obtain a symmetric agreement table.

sumtable(data1,offdiag = FALSE) %>% kable()

	very dissatisfied	dissatisfied	neutral	satisfied	very satisfied
very dissatisfied	6	1	0	2	0
dissatisfied	0	19	23	8	1
neutral	0	18	39	27	4
satisfied	0	4	28	28	16
very satisfied	0	0	2	36	38

sumtable(data1,offdiag = TRUE) %>% kable()

	very dissatisfied	dissatisfied	neutral	satisfied	very satisfied
very dissatisfied	6.0	0.5	0.0	1.0	0.0
dissatisfied	0.5	19.0	20.5	6.0	0.5
neutral	0.0	20.5	39.0	27.5	3.0
satisfied	1.0	6.0	27.5	28.0	26.0
very satisfied	0.0	0.5	3.0	26.0	38.0
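The averaging step can be sketched in base R. This is only an illustration that starts from the summed table printed above; it is not the code that sumtable() runs internally, and the object names are our own.

```r
# Summed agreement table, copied from the sumtable(data1, offdiag = FALSE) output
cats <- c("very dissatisfied", "dissatisfied", "neutral", "satisfied", "very satisfied")
raw <- matrix(c(6,  1,  0,  2,  0,
                0, 19, 23,  8,  1,
                0, 18, 39, 27,  4,
                0,  4, 28, 28, 16,
                0,  0,  2, 36, 38),
              nrow = 5, byrow = TRUE, dimnames = list(cats, cats))

# Average every off-diagonal cell with its mirror image; the diagonal is unchanged
sym <- (raw + t(raw)) / 2
sym
```

The result matches the table printed by sumtable(data1, offdiag = TRUE), for example (23 + 18) / 2 = 20.5 for dissatisfied versus neutral.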

Agreement

From the agreement table we can calculate the overall agreement, together with a confidence interval around it.

agreement(data1)

## overall agreement 
##         0.4333333

agreement(data1, confint = TRUE)

## overall agreement             lower             upper 
##         0.4333333         0.3286321         0.5434725
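As a check on the point estimate, the overall agreement can be recomputed by hand from the summed table shown earlier: it is the share of all pairwise ratings that fall on the diagonal. This sketch uses base R only and does not reproduce the confidence interval, whose method is not described here.

```r
# Summed agreement table, copied from the sumtable(data1, offdiag = FALSE) output
cats <- c("very dissatisfied", "dissatisfied", "neutral", "satisfied", "very satisfied")
raw <- matrix(c(6,  1,  0,  2,  0,
                0, 19, 23,  8,  1,
                0, 18, 39, 27,  4,
                0,  4, 28, 28, 16,
                0,  0,  2, 36, 38),
              nrow = 5, byrow = TRUE, dimnames = list(cats, cats))

# 50 photographs times choose(4, 2) = 6 rater pairs gives 300 paired ratings
n_pairs <- 50 * choose(4, 2)

# Proportion of paired ratings in which both raters chose the same category
overall <- sum(diag(raw)) / sum(raw)
overall  # 130 / 300 = 0.4333333
```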

Specific agreement

Specific agreement for polytomous data can be defined in two ways: the agreement for one category versus not that category (e.g. very satisfied versus all other categories), or the agreement for one category versus one other category (e.g. very satisfied versus satisfied). The confidence intervals for the specific agreements below are bootstrapped.

agreement(data1, specific="satisfied", confint = TRUE)

##                                       p     lower     upper
## overall agreement             0.4333333 0.3286321 0.5434725
## specific agreement: satisfied 0.3163842 0.2307576 0.4058084

agreement(data1, specific=c("satisfied", "very satisfied"), confint = TRUE)

##                                                         p     lower     upper
## overall agreement                               0.4333333 0.3286321 0.5434725
## specific agreement: satisfied vs very satisfied 0.5185185 0.3839506 0.6518599

agreement(data1, specific= c("satisfied","neutral"), confint = TRUE)

##                                                  p     lower     upper
## overall agreement                        0.4333333 0.3286321 0.5434725
## specific agreement: satisfied vs neutral 0.5045045 0.3716486 0.6207280
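The point estimates above can be recovered from the symmetric agreement table. The sketch below is our own reading of the two definitions, with hypothetical helper names sa_one() and sa_pair(); it is not taken from the package source, but it does reproduce the three reported values.

```r
# Symmetric agreement table, copied from the sumtable(data1, offdiag = TRUE) output
cats <- c("very dissatisfied", "dissatisfied", "neutral", "satisfied", "very satisfied")
sym <- matrix(c(6.0,  0.5,  0.0,  1.0,  0.0,
                0.5, 19.0, 20.5,  6.0,  0.5,
                0.0, 20.5, 39.0, 27.5,  3.0,
                1.0,  6.0, 27.5, 28.0, 26.0,
                0.0,  0.5,  3.0, 26.0, 38.0),
              nrow = 5, byrow = TRUE, dimnames = list(cats, cats))

# One category versus all other categories: diagonal cell over its row total
sa_one <- function(tab, cat) tab[cat, cat] / sum(tab[cat, ])

# One category versus one other category: only those two cells of the row are used
sa_pair <- function(tab, cat, other) tab[cat, cat] / (tab[cat, cat] + tab[cat, other])

sa_one(sym, "satisfied")                     # 28 / 88.5
sa_pair(sym, "satisfied", "very satisfied")  # 28 / (28 + 26)
sa_pair(sym, "satisfied", "neutral")         # 28 / (28 + 27.5)
```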

Conditional probability

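We can calculate the probability of any other outcome conditional on a specific outcome. Each row gives the distribution of the other rater's score given that one rater chose that row's category; the diagonal entries equal the specific agreement for each category.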

conditional.agreement(data1) %>% kable()

	prevalence	proportion	very dissatisfied	dissatisfied	neutral	satisfied	very satisfied
very dissatisfied	7.5	0.025	0.800	0.067	0.000	0.133	0.000
dissatisfied	46.5	0.155	0.011	0.409	0.441	0.129	0.011
neutral	90.0	0.300	0.000	0.228	0.433	0.306	0.033
satisfied	88.5	0.295	0.011	0.068	0.311	0.316	0.294
very satisfied	67.5	0.225	0.000	0.007	0.044	0.385	0.563
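The table above can be rebuilt from the symmetric agreement table by dividing each row by its row total. A minimal base R sketch, assuming the symmetric table printed earlier and using our own object names:

```r
# Symmetric agreement table, copied from the sumtable(data1, offdiag = TRUE) output
cats <- c("very dissatisfied", "dissatisfied", "neutral", "satisfied", "very satisfied")
sym <- matrix(c(6.0,  0.5,  0.0,  1.0,  0.0,
                0.5, 19.0, 20.5,  6.0,  0.5,
                0.0, 20.5, 39.0, 27.5,  3.0,
                1.0,  6.0, 27.5, 28.0, 26.0,
                0.0,  0.5,  3.0, 26.0, 38.0),
              nrow = 5, byrow = TRUE, dimnames = list(cats, cats))

prevalence <- rowSums(sym)               # how often each category was used
proportion <- prevalence / sum(sym)      # the same, as a share of all paired ratings
cond <- prop.table(sym, margin = 1)      # each row divided by its row total

round(cbind(prevalence, proportion, cond), 3)
```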

Weighted agreement

For ordinal data it can also be useful to look at the agreement when raters are allowed to be one category off. Weighted agreement counts ratings that differ by plus or minus one category as agreement as well; these adjacent-category ratings are multiplied by a weight (default weight=1).

weighted.agreement(data1)

## [1] 0.93

weighted.agreement(data1, weight=0.5)

## [1] 0.6816667
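Both values can be reproduced from the symmetric agreement table by giving the diagonal a weight of 1 and the cells one category away the chosen weight. The function below is a hypothetical sketch of that idea (weighted_agreement() is our own name, not the package function).

```r
# Symmetric agreement table, copied from the sumtable(data1, offdiag = TRUE) output
cats <- c("very dissatisfied", "dissatisfied", "neutral", "satisfied", "very satisfied")
sym <- matrix(c(6.0,  0.5,  0.0,  1.0,  0.0,
                0.5, 19.0, 20.5,  6.0,  0.5,
                0.0, 20.5, 39.0, 27.5,  3.0,
                1.0,  6.0, 27.5, 28.0, 26.0,
                0.0,  0.5,  3.0, 26.0, 38.0),
              nrow = 5, byrow = TRUE, dimnames = list(cats, cats))

weighted_agreement <- function(tab, weight = 1) {
  d <- abs(row(tab) - col(tab))        # number of categories between the two ratings
  w <- (d == 0) + weight * (d == 1)    # 1 on the diagonal, `weight` one step off, 0 elsewhere
  sum(w * tab) / sum(tab)
}

weighted_agreement(sym)                # (130 + 149) / 300 = 0.93
weighted_agreement(sym, weight = 0.5)  # (130 + 74.5) / 300 = 0.6816667
```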

Nominal data example

For the nominal data example we use a data set that was used in a paper by Fleiss (1971). In this data set, patients are diagnosed in 5 categories (Depression, Personality Disorder, Schizophrenia, Neurosis, and Other) by 6 raters.

data(diagnoses) 
data2 <- data.frame(lapply(diagnoses,as.factor), stringsAsFactors = TRUE)

# Shorten the diagnosis labels for rater1 to rater5.
# rater6 is not relabelled here, so its original labels show up as
# separate rows and columns in the tables below.
diag_labels <- c("Depression", "Pers disord.", "Schizophrenia", "Neurosis", "Other")
for (r in paste0("rater", 1:5)) {
  levels(data2[[r]]) <- diag_labels
}
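The relabelling above covers rater1 to rater5 only. When raters do not share one set of factor levels, a cross-table of their ratings gets extra or mismatched categories. One way to avoid this is to give every rater column the same complete level set before tabulating. The sketch below uses a small made-up data frame (not the diagnoses data) to show the idea.

```r
# Toy ratings (hypothetical): rater B never uses category "a"
toy <- data.frame(A = factor(c("a", "b", "c", "a")),
                  B = factor(c("b", "b", "c", "c")))
levels(toy$B)  # only "b" and "c"

# Give every rater column the same, complete level set
all_levels <- c("a", "b", "c")
toy[] <- lapply(toy, function(x) factor(as.character(x), levels = all_levels))

# The cross-table is now square, with matching row and column categories
table(toy$A, toy$B)
```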

Agreement table

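First, the agreement tables for all rater pairs are summed into one agreement table. Then the off-diagonal cells are averaged to obtain a symmetric agreement table. Because rater6 was not relabelled above, its four original labels (Personality Disorder, Schizophrenia, Neurosis, Other) appear as separate rows and columns after the five relabelled categories.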

sumtable(data2,offdiag = FALSE) %>% kable()

	Depression	Pers disord.	Schizophrenia	Neurosis	Other	Personality Disorder	Schizophrenia	Neurosis	Other
Depression	23	1	11	20	1	4	3	9	8
Pers disord.	0	23	6	17	5	0	3	13	8
Schizophrenia	7	4	36	0	2	0	6	2	12
Neurosis	10	16	0	53	1	0	0	24	6
Other	8	5	6	2	43	0	0	0	22
Personality Disorder	1	0	0	0	0	0	0	0	0
Schizophrenia	0	0	3	0	0	0	0	0	0
Neurosis	0	1	1	10	0	0	0	0	0
Other	1	0	3	3	7	0	0	0	0

sumtable(data2,offdiag = TRUE) %>% kable()

	Depression	Pers disord.	Schizophrenia	Neurosis	Other	Personality Disorder	Schizophrenia	Neurosis	Other
Depression	23.0	0.5	9.0	15.0	4.5	2.5	1.5	4.5	4.5
Pers disord.	0.5	23.0	5.0	16.5	5.0	0.0	1.5	7.0	4.0
Schizophrenia	9.0	5.0	36.0	0.0	4.0	0.0	4.5	1.5	7.5
Neurosis	15.0	16.5	0.0	53.0	1.5	0.0	0.0	17.0	4.5
Other	4.5	5.0	4.0	1.5	43.0	0.0	0.0	0.0	14.5
Personality Disorder	2.5	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
Schizophrenia	1.5	1.5	4.5	0.0	0.0	0.0	0.0	0.0	0.0
Neurosis	4.5	7.0	1.5	17.0	0.0	0.0	0.0	0.0	0.0
Other	4.5	4.0	7.5	4.5	14.5	0.0	0.0	0.0	0.0

Agreement

From the agreement table we can calculate the overall agreement, together with a confidence interval around it.

agreement(data2, confint = TRUE)

## overall agreement             lower             upper 
##         0.3955556         0.2805873         0.5200202

Specific agreement

Specific agreement for polytomous data can be defined in two ways: the agreement for one category versus not that category (e.g. Depression versus all other categories), or the agreement for one category versus one other category (e.g. Depression versus Schizophrenia). The confidence intervals for specific agreement are bootstrapped.

agreement(data2, specific="Depression", confint = TRUE)

##                                        p     lower     upper
## overall agreement              0.3955556 0.2805873 0.5200202
## specific agreement: Depression 0.3538462 0.1230000 0.5161988

agreement(data2, specific="Pers disord.", confint = TRUE)

##                                          p     lower     upper
## overall agreement                0.3955556 0.2805873 0.5200202
## specific agreement: Pers disord. 0.3680000 0.1777778 0.5419355

agreement(data2, specific="Schizophrenia", confint = TRUE)

##                                           p     lower     upper
## overall agreement                 0.3955556 0.2805873 0.5200202
## specific agreement: Schizophrenia 0.5333333 0.3692308 0.6667647

agreement(data2, specific="Neurosis", confint = TRUE)

##                                      p     lower     upper
## overall agreement            0.3955556 0.2805873 0.5200202
## specific agreement: Neurosis 0.4930233 0.3764137 0.5857143

agreement(data2, specific="Other", confint = TRUE)

##                                   p     lower     upper
## overall agreement         0.3955556 0.2805873 0.5200202
## specific agreement: Other 0.5931034 0.2997222 0.7393939
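To give an idea of how a bootstrapped interval for specific agreement can be obtained, the sketch below resamples subjects with replacement and takes percentile limits. The exact bootstrap variant used by the package is not stated here, so the percentile method, the simulated toy data, and the helper names spec_agree() and noisy() are all assumptions made for illustration.

```r
set.seed(1)  # only to make this illustration reproducible

# Hypothetical toy ratings: 40 subjects, 3 raters, categories "x", "y", "z"
cats <- c("x", "y", "z")
truth <- sample(cats, 40, replace = TRUE)
noisy <- function(x) {
  ifelse(runif(length(x)) < 0.7, x, sample(cats, length(x), replace = TRUE))
}
toy <- data.frame(r1 = noisy(truth), r2 = noisy(truth), r3 = noisy(truth),
                  stringsAsFactors = FALSE)

# Specific agreement for one category versus all others
spec_agree <- function(dat, cat) {
  tab <- matrix(0, length(cats), length(cats), dimnames = list(cats, cats))
  pairs <- combn(ncol(dat), 2)
  for (k in seq_len(ncol(pairs))) {
    a <- factor(dat[[pairs[1, k]]], levels = cats)
    b <- factor(dat[[pairs[2, k]]], levels = cats)
    tab <- tab + unclass(table(a, b))  # sum the tables of all rater pairs
  }
  sym <- (tab + t(tab)) / 2            # average the off-diagonal cells
  sym[cat, cat] / sum(sym[cat, ])
}

est <- spec_agree(toy, "x")

# Percentile bootstrap: resample subjects (rows) with replacement
boot <- replicate(1000, spec_agree(toy[sample(nrow(toy), replace = TRUE), ], "x"))
ci <- quantile(boot, c(0.025, 0.975), na.rm = TRUE)

c(estimate = est, ci)
```

Resampling whole subjects keeps the ratings of one subject together, so the dependence between raters is preserved in every bootstrap sample.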

Conditional agreement

conditional.agreement(data2) %>% kable()

	prevalence	proportion	Depression	Pers disord.	Schizophrenia	Neurosis	Other	Personality Disorder	Schizophrenia	Neurosis	Other
Depression	65.0	0.1444444	0.354	0.008	0.138	0.231	0.069	0.038	0.023	0.069	0.069
Pers disord.	62.5	0.1388889	0.008	0.368	0.080	0.264	0.080	0.000	0.024	0.112	0.064
Schizophrenia	67.5	0.1500000	0.133	0.074	0.533	0.000	0.059	0.000	0.067	0.022	0.111
Neurosis	107.5	0.2388889	0.140	0.153	0.000	0.493	0.014	0.000	0.000	0.158	0.042
Other	72.5	0.1611111	0.062	0.069	0.055	0.021	0.593	0.000	0.000	0.000	0.200
Personality Disorder	2.5	0.0055556	1.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000
Schizophrenia	7.5	0.0166667	0.200	0.200	0.600	0.000	0.000	0.000	0.000	0.000	0.000
Neurosis	30.0	0.0666667	0.150	0.233	0.050	0.567	0.000	0.000	0.000	0.000	0.000
Other	35.0	0.0777778	0.129	0.114	0.214	0.129	0.414	0.000	0.000	0.000	0.000

References

de Vet, H. C. W., M. G. Mullender, and I. Eekhout. 2018. “Specific Agreement on Ordinal and Multiple Nominal Outcomes Can Be Calculated for More Than Two Raters.” Journal of Clinical Epidemiology 96: 47–53. https://www.jclinepi.com/article/S0895-4356(16)30837-X/abstract.

Dikmans, R. E., L. Nene, M. B. Bouman, H. C. W. de Vet, M. Mireau, M. E. Buncamper, H. Winters, M. Ritt, and M. G. Mullender. 2017. “The Aesthetic Items Scale: A Tool for the Evaluation of Aesthetic Outcome After Breast Reconstruction.” Plastic and Reconstructive Surgery Global Open 5 (3): e1254.

Fleiss, J. L. 1971. “Measuring Nominal Scale Agreement Among Many Raters.” Psychological Bulletin 76 (5): 378–82.

Iris Eekhout, PhD

Statistician & Senior Scientist
