Bivariate ROC Curve

  • Hong, C.S. (Department of Statistics, Sungkyunkwan University) ;
  • Kim, G.C. (Research Institute of Applied Statistics, Sungkyunkwan University) ;
  • Jeong, J.A. (Research Institute of Applied Statistics, Sungkyunkwan University)
  • Received : 2011.11.08
  • Reviewed : 2012.02.27
  • Published : 2012.03.31

Abstract

In credit assessment models, classification performance is evaluated with the ROC curve, which is expressed through two univariate cumulative distribution functions: the false positive rate (the proportion of non-default borrowers wrongly predicted as default) and the true positive rate (the proportion of default borrowers correctly identified). In this paper, the score random variable is extended to the bivariate case, and an ROC curve is proposed that is expressed through the joint cumulative distribution functions of default and non-default borrowers. The bivariate ROC curve is constructed using a linear function of the score random variables that passes through the bivariate mean vectors. We explore the classification performance of the ROC curves obtained from various bivariate normal distributions and compare them with the corresponding AUROC statistics. An optimal threshold suited to a given classification criterion can be derived from the proposed bivariate ROC curve, which makes it possible to establish an optimal cut-off criterion for bivariate mixture distribution functions.
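
The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that are not stated on this page: both score components are higher on average for non-default borrowers, a borrower is classified as default when both scores fall at or below the cut-off point, and the optimal threshold is chosen with the Youden index as one example of a classification criterion. The function names (bvn_cdf, bivariate_roc, auroc, youden_cutoff) and all parameter values are hypothetical. The joint normal CDF is obtained by conditioning on the first component and integrating numerically with Simpson's rule, using only the standard library.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def bvn_cdf(a, b, mu, sigma, rho, steps=400):
    """P(X1 <= a, X2 <= b) for a bivariate normal with |rho| < 1.

    Uses P = integral of phi(z) * Phi((zb - rho*z) / sqrt(1 - rho^2)) over
    z <= za, evaluated with composite Simpson's rule (steps must be even).
    """
    (m1, m2), (s1, s2) = mu, sigma
    za, zb = (a - m1) / s1, (b - m2) / s2
    lo, hi = -8.0, min(za, 8.0)  # the density is negligible outside [-8, 8]
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    scale = sqrt(1.0 - rho * rho)

    def g(z):
        return STD.pdf(z) * STD.cdf((zb - rho * z) / scale)

    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return min(1.0, max(0.0, total * h / 3.0))


def bivariate_roc(mu_d, mu_n, sigma_d, sigma_n, rho_d, rho_n,
                  n_points=201, span=3.0):
    """Return (x1, x2, FPR, TPR) for cut-off points on the line through
    the default mean mu_d (t = 0) and the non-default mean mu_n (t = 1).

    Assumption: mu_n exceeds mu_d in both coordinates.
    """
    dx, dy = mu_n[0] - mu_d[0], mu_n[1] - mu_d[1]
    curve = []
    for k in range(n_points):
        t = -span + (1.0 + 2.0 * span) * k / (n_points - 1)
        x1, x2 = mu_d[0] + t * dx, mu_d[1] + t * dy
        fpr = bvn_cdf(x1, x2, mu_n, sigma_n, rho_n)  # non-default flagged
        tpr = bvn_cdf(x1, x2, mu_d, sigma_d, rho_d)  # default flagged
        curve.append((x1, x2, fpr, tpr))
    return curve


def auroc(curve):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = [(0.0, 0.0)] + [(f, t) for _, _, f, t in curve] + [(1.0, 1.0)]
    return sum((f1 - f0) * (t0 + t1) / 2.0
               for (f0, t0), (f1, t1) in zip(pts, pts[1:]))


def youden_cutoff(curve):
    """Cut-off point on the line that maximizes J = TPR - FPR."""
    return max(curve, key=lambda p: p[3] - p[2])


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    curve = bivariate_roc((0, 0), (2, 2), (1, 1), (1, 1), 0.5, 0.5)
    x1, x2, fpr, tpr = youden_cutoff(curve)
    print("AUROC:", round(auroc(curve), 4))
    print("Youden cut-off:", (round(x1, 3), round(x2, 3)),
          "J =", round(tpr - fpr, 4))
```

Restricting the cut-off points to a single line reduces the two-dimensional threshold search to one parameter t, so the resulting curve can be read and summarized in the same way as a univariate ROC curve. Other criteria can replace the Youden index by changing the key function in youden_cutoff.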

