Positive and negative predictive values by the TOC curve

  • Received : 2019.10.11
  • Accepted : 2020.01.18
  • Published : 2020.03.31

Abstract

Sensitivity and specificity are popular measures described by the receiver operating characteristic (ROC) curve. Two further measures are the positive predictive value (PPV) and the negative predictive value (NPV); however, the PPV and NPV cannot be represented by the ROC curve. Based on the total operating characteristic (TOC) curve suggested by Pontius and Si (International Journal of Geographical Information Science, 97, 570-583, 2014), explanatory methods are proposed to describe the PPV and NPV geometrically on the TOC curve. The PPV can be regarded as the slope of the hypotenuse of the right-angled triangle connecting the origin to a given point on the TOC curve, while 1 - NPV can be represented as the slope of the hypotenuse of the right-angled triangle connecting that point to the top right corner of the TOC curve. When a neutral zone exists, the PPV and 1 - NPV can be described as the slopes of two other right-angled triangles of the TOC curve. Therefore, both the PPV and NPV can be estimated using the TOC curve, whether or not a neutral zone is present.
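The slope interpretation for the case without a neutral zone can be checked numerically. The following is a minimal sketch rather than the authors' procedure: it assumes the usual TOC axes (horizontal: hits plus false alarms; vertical: hits) and that a case is classified positive when its score is at or above the threshold. The function names and the toy data are hypothetical.

```python
def toc_point(scores, labels, threshold):
    """Return the TOC coordinates (hits + false alarms, hits) at a threshold.

    Assumption: a case is predicted positive when score >= threshold.
    """
    hits = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    false_alarms = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return hits + false_alarms, hits


def ppv_npv_from_toc(scores, labels, threshold):
    """Read the PPV and NPV off the TOC curve as slopes of two line segments."""
    n = len(labels)   # total number of cases: x-coordinate of the top right corner
    p = sum(labels)   # number of true positives in the data: y-coordinate of that corner
    x, y = toc_point(scores, labels, threshold)
    if x == 0 or x == n:
        raise ValueError("slope undefined: the point coincides with a corner")
    ppv = y / x                        # slope from the origin (0, 0) to (x, y)
    one_minus_npv = (p - y) / (n - x)  # slope from (x, y) to the corner (n, p)
    return ppv, 1 - one_minus_npv


# Hypothetical toy data: 8 cases, 4 of them truly positive (label 1).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

ppv, npv = ppv_npv_from_toc(scores, labels, threshold=0.5)
print(ppv, npv)
```

At the threshold 0.5 the TOC point is (4, 3), so both slopes agree with the confusion-matrix definitions PPV = TP / (TP + FP) and NPV = TN / (TN + FN).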

References

  1. Bradley AP (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms, Pattern Recognition, 30, 1145-1159. https://doi.org/10.1016/S0031-3203(96)00142-2
  2. Daniel and Steven S (2016). Maximizing the usefulness of statistical classifiers for two populations with illustrative applications, Statistical Methods in Medical Research, 27, 2344-2358. https://doi.org/10.1177/0962280216680244
  3. Egan JP (1975). Signal detection theory and ROC analysis, Academic Press, New York.
  4. Engelmann B, Hayden E, and Tasche D (2003). Testing rating accuracy, Risk, 16, 82-86.
  5. Fawcett T (2003). ROC graphs: Notes and practical considerations for data mining researchers (Technical report), Available from: http://www.blogspot.udec.ugto.saedsayad.com/docs/ROC101.pdf
  6. Fawcett T (2006). An introduction to ROC analysis, Pattern Recognition Letters, 27, 861-874. https://doi.org/10.1016/j.patrec.2005.10.010
  7. Fawcett T and Provost F (1997). Adaptive fraud detection, Data Mining and Knowledge Discovery, 1, 291-316. https://doi.org/10.1023/A:1009700419189
  8. Garcia V, Mollineda RA, and Sanchez JS (2010). Theoretical analysis of a performance measure for imbalanced data, 20th International Conference on Pattern Recognition, 2010, 617-620.
  9. Hong CS, Kim JH, and Choi JS (2009). Adjusted ROC and CAP curves, Korean Journal of Applied Statistics, 22, 29-39. https://doi.org/10.5351/KJAS.2009.22.1.029
  10. Hong CS and Lee SJ (2018). TROC curve and accuracy measures, Journal of the Korean Data & Information Science Society, 29, 861-872. https://doi.org/10.7465/jkdi.2018.29.4.861
  11. Hsieh F and Turnbull BW (1996). Nonparametric and semiparametric estimation of the receiver operating characteristic curve, The Annals of Statistics, 24, 25-40. https://doi.org/10.1214/aos/1033066197
  12. Metz CE and Kronman HB (1980). Statistical significance tests for binormal ROC curves, Journal of Mathematical Psychology, 22, 218-243. https://doi.org/10.1016/0022-2496(80)90020-6
  13. Pepe MS (2000). Receiver operating characteristic methodology, Journal of the American Statistical Association, 95, 308-311. https://doi.org/10.1080/01621459.2000.10473930
  14. Pepe MS (2003). The statistical evaluation of medical tests for classification and prediction, Oxford University Press, Oxford.
  15. Pontius RG and Si K (2014). The total operating characteristic to measure diagnostic ability for multiple thresholds, International Journal of Geographical Information Science, 97, 570-583. https://doi.org/10.1080/13658816.2013.862623
  16. Provost F and Fawcett T (1997). Analysis and visualization of classifier performance comparison under imprecise class and cost distributions, Knowledge Discovery and Data Mining, 97, 43-48.
  17. Provost F and Fawcett T (2001). Robust classification for imprecise environments, Machine Learning, 42, 203-231. https://doi.org/10.1023/A:1007601015854
  18. Raslich MA, Markert RJ, and Stutes SA (2007). Selecting and interpreting diagnostic tests, Biochemia Medica, 17, 139-270.
  19. Stein RM (2005). The relationship between default prediction and lending profits: Integrating ROC analysis and loan pricing, Journal of Banking & Finance, 29, 1213-1236. https://doi.org/10.1016/j.jbankfin.2004.04.008
  20. Zhou XH, Obuchowski NA, and McClish DK (2002). Statistical Methods in Diagnostic Medicine, Wiley-InterScience, New York.
  21. Zweig M and Campbell G (1993). Receiver-operating characteristics (ROC) plots: A fundamental evaluation tool in clinical medicine, Clinical Chemistry, 39, 561-577. https://doi.org/10.1093/clinchem/39.4.561