Reliability

Rater agreement & reliability

In studies where more than one rater judges a characteristic, the agreement between their judgements is of interest. Historically, a kappa statistic has most often been used to assess agreement. However, kappa is a measure of reliability rather than of agreement; to quantify agreement itself, the percentage of absolute agreement is more informative. The reliability of the ratings can in turn be estimated with different variants of the intraclass correlation coefficient (ICC). The choice between the ICC oneway, ICC consistency, and ICC agreement depends on the study design and goals.
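To make the two notions concrete, below is a minimal sketch in Python that computes the percentage of absolute agreement for categorical ratings and the one-way, consistency, and agreement forms of the single-rater ICC for a complete subjects-by-raters table. The example data and function names are illustrative and not taken from any particular library; the ICC expressions follow the standard mean-square formulations.

```python
# Minimal sketch (hypothetical names and data): percentage of absolute agreement
# and the one-way, consistency, and agreement ICCs for an n-subjects x k-raters
# table with no missing values.
import numpy as np
from itertools import combinations


def percent_agreement(ratings):
    """Average pairwise percentage of identical ratings.

    ratings: (n_subjects, n_raters) array of categorical scores.
    """
    ratings = np.asarray(ratings)
    pairs = combinations(range(ratings.shape[1]), 2)
    return 100 * np.mean([np.mean(ratings[:, i] == ratings[:, j]) for i, j in pairs])


def icc(ratings):
    """One-way, consistency, and agreement ICCs (single-rater forms).

    ratings: (n_subjects, n_raters) array of continuous scores,
    every subject rated by every rater.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()

    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between raters
    ss_total = ((x - grand) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols               # residual (two-way model)
    ss_within = ss_total - ss_rows                        # within subjects (one-way model)

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    ms_within = ss_within / (n * (k - 1))

    oneway = (ms_rows - ms_within) / (ms_rows + (k - 1) * ms_within)
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k / n * (ms_cols - ms_error)
    )
    return {"oneway": oneway, "consistency": consistency, "agreement": agreement}


# Hypothetical example: 5 subjects each scored by 3 raters.
scores = np.array([
    [8, 7, 8],
    [5, 5, 6],
    [9, 9, 9],
    [3, 4, 3],
    [6, 6, 7],
])
print(percent_agreement(scores))  # share of identical pairwise ratings
print(icc(scores))                # the three ICC variants
```

The sketch also shows why the choice of ICC matters: the consistency ICC ignores systematic differences between raters (the rater mean square does not enter its denominator), whereas the agreement ICC penalises them, and the one-way ICC is appropriate when each subject may be rated by a different set of raters.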