http://dx.doi.org/10.5351/KJAS.2021.34.2.141

Undecided inference using the difference of AUCs  

Hong, Chong Sun (Department of Statistics, Sungkyunkwan University)
Na, Hae Rin (Department of Statistics, Sungkyunkwan University)
Publication Information
The Korean Journal of Applied Statistics, v.34, no.2, 2021, pp. 141-152
Abstract
To re-evaluate the undecided inference, a new statistical model with additional variables is needed. The missing not at random (MNAR) assumption is then required, since the probability of positivity is calculated differently for the indeterminant and the determinant. In this study, because the two statistical models have a hierarchical relationship, we carry out the undecided inference under the MNAR assumption using the confidence interval of the difference between the two areas under the ROC curve (AUCs). Among the many methods for estimating the confidence interval of the AUC difference, simulations show that four perform particularly well. Based on these methods, we propose a variable selection method that is useful for the undecided inference using logistic regression models.
Keywords
hierarchy; indeterminant; logistic regression; missing; variable selection
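
The decision rule outlined in the abstract (compare two hierarchical models through a confidence interval for the difference of their AUCs) can be illustrated with a minimal sketch. It uses the nonparametric interval of DeLong et al. (1988) for paired AUCs, chosen here only because it is widely known; the abstract does not name the four methods the authors recommend, so this is not necessarily one of them, and normal-approximation intervals of this kind are known to be less reliable for nested models. The function names, the 0/1 label coding, and the "lower bound above zero" rule are assumptions made for illustration.

```python
import math


def _psi(x, y):
    """Mann-Whitney kernel: 1 if x > y, 0.5 if tied, 0 otherwise."""
    if x > y:
        return 1.0
    return 0.5 if x == y else 0.0


def auc_components(pos, neg):
    """Return the empirical AUC and DeLong's structural components.

    v10[i] is the fraction of negatives ranked below positive i,
    v01[j] is the fraction of positives ranked above negative j.
    """
    m, n = len(pos), len(neg)
    v10 = [sum(_psi(x, y) for y in neg) / n for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / m for y in neg]
    return sum(v10) / m, v10, v01


def _sample_var(a):
    """Unbiased sample variance."""
    mean = sum(a) / len(a)
    return sum((x - mean) ** 2 for x in a) / (len(a) - 1)


def auc_difference_ci(scores_full, scores_reduced, labels, z=1.959964):
    """Approximate 95% CI for AUC(full) - AUC(reduced) on the same cases.

    scores_full / scores_reduced: fitted probabilities from the model with
    and without the additional variables (hypothetical inputs).
    labels: 1 for positive cases, 0 for negative cases (assumed coding).
    """
    def split(scores):
        pos = [s for s, y in zip(scores, labels) if y == 1]
        neg = [s for s, y in zip(scores, labels) if y == 0]
        return pos, neg

    auc_f, v10_f, v01_f = auc_components(*split(scores_full))
    auc_r, v10_r, v01_r = auc_components(*split(scores_reduced))

    # Var(A_f - A_r) = Var(V10_f - V10_r)/m + Var(V01_f - V01_r)/n,
    # which accounts for the correlation between the two paired AUCs.
    d10 = [a - b for a, b in zip(v10_f, v10_r)]
    d01 = [a - b for a, b in zip(v01_f, v01_r)]
    var = _sample_var(d10) / len(d10) + _sample_var(d01) / len(d01)

    diff = auc_f - auc_r
    half = z * math.sqrt(var)
    return diff, diff - half, diff + half


def keep_additional_variables(scores_full, scores_reduced, labels):
    """Illustrative rule: keep the extra variables if the CI lies above 0."""
    _, lower, _ = auc_difference_ci(scores_full, scores_reduced, labels)
    return lower > 0
```

In use, `scores_full` and `scores_reduced` would be the predicted probabilities of two fitted logistic regression models evaluated on the same observations; any other interval estimator for the AUC difference could be swapped in for `auc_difference_ci` without changing the decision rule.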