Partial AUC and optimal thresholds

  • Received : 2019.02.11
  • Accepted : 2019.02.27
  • Published : 2019.04.30

Abstract

An extensive literature addresses the estimation of optimal thresholds under various accuracy measures using receiver operating characteristic (ROC) and cumulative accuracy profile (CAP) curves. This paper proposes an alternative measure that represents a specific partial area under the ROC and CAP curves. The relationship between the ROC and CAP functions is examined using differential equations of the newly defined partial areas under the curves. The relationship between these partial areas and the optimal thresholds under various accuracy measures for the ROC and CAP functions is then derived. Optimal thresholds are obtained under the assumption that the two distribution functions composing the mixture distribution are various normal distributions. The type I and type II errors corresponding to these optimal thresholds are also explored and discussed for each accuracy measure.
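The quantities named in the abstract can be illustrated with a minimal sketch. This page does not reproduce the paper's own definition of the partial area, so the code below uses only the standard textbook constructions and should not be read as the authors' method. The assumptions are: scores at or below a threshold `c` are flagged as "bad"; the ROC curve plots the bad-class CDF against the good-class CDF; the CAP curve plots the bad-class CDF against the mixture CDF with mixing proportion `theta`; the partial area is the area accumulated up to a threshold `c_max`; and the Youden index stands in for one of the accuracy measures. All function names and parameter values are hypothetical.

```python
from statistics import NormalDist


def curve_point(c, bad, good, theta):
    """Return (ROC x, curve y, CAP x) at threshold c.

    Assumption: scores <= c are flagged as 'bad'.
    """
    y = bad.cdf(c)                       # share of bad cases flagged
    roc_x = good.cdf(c)                  # share of good cases flagged
    cap_x = theta * y + (1 - theta) * roc_x  # share of the whole mixture flagged
    return roc_x, y, cap_x


def partial_auc(bad, good, theta, c_max, curve="roc", lo=-10.0, n=4000):
    """Trapezoidal area under the ROC or CAP curve for thresholds in [lo, c_max]."""
    idx = 0 if curve == "roc" else 2
    step = (c_max - lo) / n
    area = 0.0
    prev = curve_point(lo, bad, good, theta)
    for i in range(1, n + 1):
        cur = curve_point(lo + i * step, bad, good, theta)
        area += 0.5 * (cur[1] + prev[1]) * (cur[idx] - prev[idx])
        prev = cur
    return area


def youden_threshold(bad, good, lo, hi, n=4000):
    """Grid search for the threshold maximising J(c) = F_bad(c) - F_good(c)."""
    best_c, best_j = lo, float("-inf")
    for i in range(n + 1):
        c = lo + i * (hi - lo) / n
        j = bad.cdf(c) - good.cdf(c)
        if j > best_j:
            best_c, best_j = c, j
    return best_c, best_j


if __name__ == "__main__":
    # Hypothetical mixture: 20% bad ~ N(0, 1), 80% good ~ N(1, 1).
    bad, good, theta = NormalDist(0, 1), NormalDist(1, 1), 0.2
    c_opt, j_opt = youden_threshold(bad, good, -5, 5)
    # Error labels are an assumption of this sketch, not the paper's convention.
    miss_rate = 1 - bad.cdf(c_opt)       # bad cases not flagged
    false_alarm = good.cdf(c_opt)        # good cases flagged
    print("threshold:", c_opt, "Youden J:", j_opt)
    print("errors:", miss_rate, false_alarm, miss_rate + false_alarm)
    print("partial AUROC up to threshold:", partial_auc(bad, good, theta, c_opt, "roc"))
    print("partial AUCAP up to threshold:", partial_auc(bad, good, theta, c_opt, "cap"))
```

Under these assumptions the full areas satisfy AUCAP = θ/2 + (1 − θ)·AUROC, which follows from integrating the bad-class CDF against the mixture CDF; this is one simple instance of the kind of link between the two curves that the abstract refers to.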

Figure 4.1. Optimal thresholds.

Figure 4.2. Values of errors.

Figure 4.3. Optimal thresholds.

Figure 4.4. Values of errors.

Table 2.1. The second derivatives of AUROC(u) and AUCAP(u)

Table 4.1. Optimal thresholds

Table 4.2. Values of α, β, α + β

Table 4.3. Optimal thresholds

Table 4.4. Values of α, β, α + β
