PLoS One. 2019 Sep 26;14(9):e0222916.
doi: 10.1371/journal.pone.0222916. eCollection 2019.

Why Cohen's Kappa should be avoided as performance measure in classification


Rosario Delgado et al. PLoS One.

Abstract

We show that Cohen's Kappa and the Matthews Correlation Coefficient (MCC), both widely used and well-established measures of performance in multi-class classification, are correlated in most situations, although they can differ in others. Indeed, although the two coincide in the symmetric case, we consider different unbalanced situations in which Kappa exhibits an undesired behaviour, i.e. a worse classifier gets a higher Kappa score, differing qualitatively from that of MCC. The debate about the incoherence in the behaviour of Kappa revolves around the convenience, or not, of using a relative metric, which makes the interpretation of its values difficult. We extend these concerns by showing that its pitfalls can go even further. Through experimentation, we present a novel approach to this topic. We carry out a comprehensive study that identifies a scenario in which the contradictory behaviour between MCC and Kappa emerges. Specifically, we find that when the entropy of the off-diagonal elements of the confusion matrix associated with a classifier decreases to zero, the discrepancy between Kappa and MCC rises, pointing to an anomalous performance of the former. We believe that this finding disqualifies Kappa from being used in general as a performance measure to compare classifiers.
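The claim that the two measures coincide in the symmetric case can be checked numerically. The following is a minimal sketch, not the authors' code: it uses the standard textbook definitions of Cohen's Kappa and of the multi-class MCC computed from a confusion matrix (rows are true classes, columns are predicted classes). The function names `cohen_kappa` and `mcc`, the example matrices, and the choice to return 0.0 when a denominator vanishes are assumptions made for illustration.

```python
import math

def _totals(cm):
    """Return (n, trace, row totals, column totals) of a square confusion matrix."""
    k = len(cm)
    rows = [sum(cm[i]) for i in range(k)]
    cols = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    trace = sum(cm[i][i] for i in range(k))
    return sum(rows), trace, rows, cols

def cohen_kappa(cm):
    """Kappa = (p_o - p_e) / (1 - p_e), written with raw counts."""
    n, trace, rows, cols = _totals(cm)
    chance = sum(r * c for r, c in zip(rows, cols))  # n^2 * p_e
    den = n * n - chance
    # Assumed convention for this sketch: 0.0 when the denominator vanishes.
    return (trace * n - chance) / den if den else 0.0

def mcc(cm):
    """Multi-class Matthews Correlation Coefficient from raw counts."""
    n, trace, rows, cols = _totals(cm)
    chance = sum(r * c for r, c in zip(rows, cols))
    den = math.sqrt((n * n - sum(r * r for r in rows)) *
                    (n * n - sum(c * c for c in cols)))
    # Assumed convention for this sketch: 0.0 when the denominator vanishes.
    return (trace * n - chance) / den if den else 0.0

# Hypothetical matrices, chosen only to illustrate the two cases.
symmetric = [[5, 2], [2, 3]]   # row totals equal column totals
asymmetric = [[5, 4], [0, 3]]  # all errors fall in a single cell

for name, cm in (("symmetric", symmetric), ("asymmetric", asymmetric)):
    print(name, "Kappa =", cohen_kappa(cm), "MCC =", mcc(cm))
```

Both measures share the same numerator. When the matrix is symmetric, row and column totals are equal, so the two denominators coincide as well and the scores match; otherwise the denominators differ and so do the scores.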


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Agreement between MCC and Kappa for C0.
Unbalanced case with underrepresentation of the negative class, which is perfectly classified. (a) With a = d = 1, as a function of b: positive class mainly misclassified. (b) With b = d = 1, as a function of a: positive class mainly well classified.

Fig 2. Disagreement between MCC and Kappa for C1,aa,b with a = 0.2, as a function of b ≥ 0.
If b > 1, the negative class is underrepresented and quite misclassified, and the positive class is mainly misclassified. (a) A zoom of the detail for b ≤ 2. (b) For b ≤ 30.

Fig 3. Disagreement between MCC and Kappa for C1,0a,b with a = 1, as a function of b ≥ 0.
If b > 1, the negative class is underrepresented and systematically misclassified, and the positive class is also mainly misclassified. (a) A zoom of the detail for b ≤ 2. (b) For b ≤ 30.

Fig 4. Disagreement between MCC and Kappa for C1,1a,b with a = 0.2, as a function of b ≥ 0.
The negative class is classified at random. If b > 1, the positive class is mainly misclassified, and the negative class is underrepresented. (a) A zoom of the detail for b ≤ 2. (b) For b ≤ 30.

Fig 5. Disagreement between MCC and Kappa for ZA, for different values of N.
(a) N = 2, a zoom of the detail for A ≤ 5. (b) N = 2, A ≤ 100. (c) N = 5, A ≤ 500. (d) N = 10, A ≤ 1000.

Fig 6. Experimental agreement between MCC and Kappa for M1(A).
Increasing asymmetry but constant entropy.

Fig 7. Experimental disagreement between MCC and Kappa for M2(A).
Entropy decreasing to zero, which implies increasing asymmetry.

Fig 8. Experimental agreement between MCC and Kappa for M3(A).
Entropy decreasing to a positive limit, with constant asymmetry.

Fig 9. Experimental disagreement between MCC and Kappa for M4(A).
Entropy decreases to zero, which implies that asymmetry increases, for A increasing from 50 to 100 and, by symmetry, for A decreasing from 50 to 0.
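
The captions for Figs 6 to 9 track the entropy of the off-diagonal elements of the confusion matrix. This page does not give the exact definition used in the paper, so the sketch below assumes the Shannon entropy (base 2) of the off-diagonal counts after normalizing them to sum to one. The function name `off_diagonal_entropy` and the example matrices are hypothetical.

```python
import math

def off_diagonal_entropy(cm):
    """Shannon entropy (bits) of the misclassification counts of a square matrix.

    Assumption: the off-diagonal entries are normalized to a probability
    distribution; the paper's own definition may differ in detail.
    """
    k = len(cm)
    errors = [cm[i][j] for i in range(k) for j in range(k) if i != j]
    total = sum(errors)
    if total == 0:
        return 0.0  # no misclassifications at all
    probs = [e / total for e in errors if e > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical matrices: errors spread evenly vs. concentrated in one cell.
spread = [[5, 2], [2, 3]]
concentrated = [[5, 4], [0, 3]]

print("spread:", off_diagonal_entropy(spread))
print("concentrated:", off_diagonal_entropy(concentrated))
```

Under this assumed definition, the entropy is highest when the errors are spread evenly over the off-diagonal cells and falls to zero when all errors concentrate in a single cell, which is the regime where the abstract reports that Kappa and MCC diverge.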



Grants and funding

The authors are supported by Ministerio de Ciencia, Innovación y Universidades del Gobierno de España, project ref. PGC2018 - 097848 - B - I0.
