http://dx.doi.org/10.5351/CSAM.2015.22.4.377

Test Statistics for Volume under the ROC Surface and Hypervolume under the ROC Manifold  

Hong, Chong Sun (Department of Statistics, Sungkyunkwan University)
Cho, Min Ho (Department of Statistics, Sungkyunkwan University)
Publication Information
Communications for Statistical Applications and Methods, v.22, no.4, 2015, pp. 377-387
Abstract
The area under the ROC curve can be represented by both the Mann-Whitney statistic and the Wilcoxon rank sum statistic. Consider the ROC surface and the ROC manifold, which arise in three or more dimensions. This paper finds that the volume under the ROC surface (VUS) and the hypervolume under the ROC manifold (HUM) can be derived as functions of both conditional Mann-Whitney statistics and conditional Wilcoxon rank sum statistics. The null hypothesis that three or more distribution functions are identical can be tested using VUS and HUM statistics based on the asymptotic large-sample theory of Wilcoxon rank sum statistics. Illustrative examples with three and four random samples show that the two approaches give the same values of VUS and $HUM^4$. The equivalence of several distribution functions is also tested with VUS and $HUM^4$ in terms of conditional Wilcoxon rank sum statistics.
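The relationships stated in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' derivation: it computes the empirical AUC once from Mann-Whitney pair counts and once from the Wilcoxon rank sum, and then computes an empirical VUS by brute-force counting of ordered triples. The function names and the sample values are hypothetical, the rank-sum version assumes no tied observations, and the conditional statistics used in the paper are not reproduced here.

```python
from itertools import product

def auc_mann_whitney(x, y):
    """Empirical AUC: fraction of pairs (x_i, y_j) with x_i < y_j; ties count 1/2."""
    u = sum(1.0 if a < b else 0.5 if a == b else 0.0 for a, b in product(x, y))
    return u / (len(x) * len(y))

def auc_rank_sum(x, y):
    """Same AUC from the Wilcoxon rank sum of y in the pooled sample.

    Assumes all pooled values are distinct, so each rank is well defined.
    """
    pooled = sorted(x + y)
    w = sum(pooled.index(b) + 1 for b in y)  # rank sum of the y sample
    n, m = len(x), len(y)
    return (w - m * (m + 1) / 2) / (n * m)

def vus(x, y, z):
    """Empirical VUS: fraction of triples with x_i < y_j < z_k (brute force)."""
    count = sum(1 for a, b, c in product(x, y, z) if a < b < c)
    return count / (len(x) * len(y) * len(z))

# Illustrative samples only (not data from the paper).
x = [1.0, 2.0, 4.0]
y = [3.0, 5.0, 6.0]
z = [5.5, 7.0, 8.0]

auc_u = auc_mann_whitney(x, y)  # 8 of 9 pairs are correctly ordered
auc_w = auc_rank_sum(x, y)      # rank sum 14, minus 3*4/2 = 6, divided by 9
vus_xyz = vus(x, y, z)          # 21 of 27 triples are correctly ordered
```

For these samples the two AUC computations agree at 8/9, and the empirical VUS is 21/27. Under the null hypothesis of identical distributions, the expected values are 1/2 for the AUC and 1/6 for the VUS, since each of the 3! orderings of a triple is equally likely.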
Keywords
manifold; Mann-Whitney; nonparametric; ROC; surface; Wilcoxon