Original article
Behavior and interpretation of the κ statistic: Resolution of the two paradoxes

https://doi.org/10.1016/0895-4356(95)00571-4

Abstract

Two apparent paradoxes have been identified for the kappa (κ) statistic: (1) high levels of observer agreement with low κ values; and (2) lack of predictability of changes in κ with changing marginals. The first paradox is a function of the prevalence of the trait in the sample, while the second is related to the symmetry of observations in the disagreement categories. In examining the behavior of κ as a function of the distribution of responses in a contingency table, we found that for any measured level of observer agreement (Po) there are three characteristic values of κ: κmax, κmin, and κnor, each of which is a function only of Po. These characteristic values allow an observed kappa (κo) to be placed in perspective. Once symmetry in the agreement and disagreement categories is taken into account, the behavior of κ is readily understood and predictable. We define symmetry expressions for agreement (SA) and disagreement (SD) to represent and quantify these effects. κ alone has little interpretive value, and we recommend that studies reporting κ also report Po, SD, and P++ (agreement on the presence of the trait).
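The abstract's recommendation can be made concrete with a small sketch. The Python function below (not from the paper; the 2×2 layout and names are assumptions) computes the observed agreement Po, the chance agreement Pe, Cohen's κ = (Po − Pe)/(1 − Pe), and P++, the proportion of subjects for which both raters agree the trait is present. The symmetry expressions SA and SD are defined in the full text and are not reproduced here.

```python
# Minimal sketch (not from the paper): Cohen's kappa and the companion
# quantities the abstract recommends reporting, for a 2x2 rater table.
# Cell layout: table[i][j] = count where rater A gave i and rater B gave j,
# with index 0 = trait present, 1 = trait absent.

def kappa_summary(table):
    n = sum(sum(row) for row in table)
    po = (table[0][0] + table[1][1]) / n        # observed agreement, Po
    p_plus_plus = table[0][0] / n               # agreement on presence, P++
    # chance agreement Pe from the marginal proportions
    row = [sum(table[i]) / n for i in range(2)]
    col = [(table[0][j] + table[1][j]) / n for j in range(2)]
    pe = row[0] * col[0] + row[1] * col[1]
    kappa = (po - pe) / (1 - pe)
    return {"Po": po, "Pe": pe, "kappa": kappa, "P++": p_plus_plus}

# Example: 85% raw agreement with a skewed prevalence gives kappa ~= 0.32
print(kappa_summary([[80, 5], [10, 5]]))
```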


Cited by (258)

  • A deep learning approach for COVID-19 detection from computed tomography scans

    2022, Applications of Artificial Intelligence in Medical Imaging
  • Prevalence of intestinal parasites in dogs in southern Ontario, Canada, based on fecal samples tested using sucrose double centrifugation and Fecal Dx® tests

    2021, Veterinary Parasitology: Regional Studies and Reports
    Citation Excerpt:

    Based on Cohen's kappa, the level of agreement between the results obtained with the Fecal Dx® tests and the sucrose double centrifugation method was moderate for both roundworms and whipworms, and fair for hookworms. However, Cohen's kappa is dependent on prevalence and is affected by unbalanced marginal totals (i.e., when the apparent prevalence is far from 50%, so that the number of positive results is very different from the number of negative results), potentially resulting in an unreliable kappa value (Lantz and Nebenzahl, 1996; Dohoo et al., 2003). For this reason, it can be misleading to present Cohen's kappa alone (Banegas et al., 2015).
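
The prevalence effect mentioned in this excerpt, the first paradox of the abstract, can be illustrated with a pair of hypothetical 2×2 tables (not drawn from either paper): both have the same observed agreement Po = 0.90, yet κ falls sharply when the apparent prevalence moves far from 50%.

```python
# Hypothetical illustration of the prevalence effect on Cohen's kappa:
# both tables have Po = 0.90, but kappa differs greatly because the
# marginal totals (apparent prevalence) differ.

def cohen_kappa(a, b, c, d):
    # 2x2 cells: a = both positive, b and c = disagreements, d = both negative
    n = a + b + c + d
    po = (a + d) / n
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (po - pe) / (1 - pe)

print(cohen_kappa(45, 5, 5, 45))  # balanced prevalence     -> kappa = 0.80
print(cohen_kappa(85, 5, 5, 5))   # prevalence far from 50% -> kappa ~= 0.44
```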
