Selection of markers in the framework of multivariate receiver operating characteristic curve analysis in binary classification

  • Received : 2018.08.03
  • Accepted : 2019.01.07
  • Published : 2019.03.31

Abstract

Classification models based on receiver operating characteristic (ROC) curve analysis have been extended from the univariate to the multivariate setting by linearly combining multiple available markers. One such classification model is multivariate ROC curve analysis. In practice, however, not all markers contribute to the classification, and those that do not may mask the contribution of the others in classifying individuals or objects. This paper addresses the issue by developing an algorithm that identifies the markers that are significant, true contributors. The proposed variable selection framework is supported by real datasets and a simulation study, and it is shown to provide insight into each marker's significance in producing a classifier rule (linear combination) with good classification performance.
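To make the idea of linearly combining markers concrete, the following is a minimal sketch rather than the paper's selection algorithm. It assumes both groups are multivariate normal and uses the standard result that the combination maximizing the area under the ROC curve (AUC) has weights proportional to (Σ0 + Σ1)⁻¹(μ1 − μ0). The function name, the simulated data, and the choice of three markers (the third deliberately uninformative) are illustrative assumptions.

```python
import math

import numpy as np


def best_linear_combination(healthy, diseased):
    """Estimate AUC-maximizing weights and the implied AUC.

    Assumes (hypothetically) that both groups are multivariate normal.
    Rows are subjects, columns are markers.
    """
    mu0 = healthy.mean(axis=0)
    mu1 = diseased.mean(axis=0)
    sigma0 = np.cov(healthy, rowvar=False)
    sigma1 = np.cov(diseased, rowvar=False)

    delta = mu1 - mu0
    # Weights b = (Sigma0 + Sigma1)^{-1} (mu1 - mu0)
    weights = np.linalg.solve(sigma0 + sigma1, delta)

    # AUC = Phi(sqrt(delta' (Sigma0 + Sigma1)^{-1} delta)),
    # with Phi the standard normal CDF written via erf.
    separation = math.sqrt(float(delta @ weights))
    auc = 0.5 * (1.0 + math.erf(separation / math.sqrt(2.0)))
    return weights, auc


# Simulated data (an assumption for illustration): markers 1 and 2 differ
# between groups, marker 3 has the same distribution in both groups.
rng = np.random.default_rng(0)
healthy = rng.normal(loc=0.0, scale=1.0, size=(500, 3))
diseased = rng.normal(loc=[1.0, 0.5, 0.0], scale=1.0, size=(500, 3))

weights, auc = best_linear_combination(healthy, diseased)
print("weights:", np.round(weights, 3))
print("model-based AUC:", round(auc, 3))
```

In this simulated setting the weight on the third marker comes out close to zero, which is the kind of non-contributing marker a selection procedure would aim to flag. How the paper itself tests marker significance is not specified in the abstract and is not reproduced here.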
