http://dx.doi.org/10.29220/CSAM.2018.25.3.307

Applying a modified AUC to gene ranking  

Yu, Wenbao (Department of Statistics, Chonnam National University)
Chang, Yuan-Chin Ivan (Institute of Statistical Science, Academia Sinica)
Park, Eunsik (Department of Statistics, Chonnam National University)
Publication Information
Communications for Statistical Applications and Methods / v.25, no.3, 2018, pp. 307-319
Abstract
High-throughput technologies enable the simultaneous evaluation of thousands of genes that could discriminate between subclasses of complex diseases. Ranking genes according to differential expression is an important screening step for follow-up analysis, and many statistical measures have been proposed for this purpose. A good ranked list should be stable (at least for the top-ranked genes), and the top-ranked genes should have high power to differentiate disease statuses. However, the literature places little emphasis on ranking genes by these two criteria simultaneously. To meet both criteria, we propose applying a previously reported metric, the modified area under the receiver operating characteristic curve, to gene ranking. The proposed ranking method is found to be promising: it leads to a stable ranked list and good prediction performance for the top-ranked genes. The findings are illustrated through studies on both synthesized data and real microarray gene expression data. The proposed method is recommended for ranking genes or other biomarkers in high-dimensional omics studies.
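As a rough illustration of AUC-based gene ranking in general, the sketch below scores each gene by its ordinary empirical AUC (the Mann-Whitney form) and sorts genes by that score. It does not implement the modified AUC of the article, whose definition is not given on this page. The function names `empirical_auc` and `rank_genes` and the toy data are assumptions made for the example.

```python
import numpy as np

def empirical_auc(x, y):
    """Ordinary empirical AUC of one gene: P(case > control) + 0.5 * P(tie).

    x: expression values (1-D array); y: labels, 1 = case, 0 = control.
    This is the standard AUC, NOT the article's modified AUC.
    """
    cases, controls = x[y == 1], x[y == 0]
    diff = cases[:, None] - controls[None, :]  # all case/control pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def rank_genes(X, y):
    """Return (order, scores): gene indices sorted from best to worst.

    The score is max(AUC, 1 - AUC), so genes that are higher in cases
    and genes that are lower in cases are treated alike.
    """
    aucs = np.array([empirical_auc(X[:, j], y) for j in range(X.shape[1])])
    scores = np.maximum(aucs, 1.0 - aucs)
    order = np.argsort(-scores, kind="stable")
    return order, scores

# Hypothetical toy data: 4 samples (rows) by 3 genes (columns).
X = np.array([[0.1, 5.0, 2.0],
              [0.4, 1.0, 2.5],
              [0.3, 6.0, 3.9],
              [0.2, 2.0, 4.1]])
y = np.array([0, 0, 1, 1])

order, scores = rank_genes(X, y)
print(order)   # gene indices, most discriminative first
print(scores)  # per-gene scores in original column order
```

One way to probe the stability of such a ranking would be to recompute `rank_genes` on resampled data and compare the top of each list, although the article's own stability assessment may differ.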
Keywords
gene ranking; modified AUC; ROC curve
References
1 Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, and Levine AJ (1999). Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays, Proceedings of the National Academy of Sciences, 96, 6745-6750.
2 Bamber D (1975). The area above the ordinal dominance graph and the area below the receiver operating characteristic graph, Journal of Mathematical Psychology, 12, 387-415.
3 Benjamini Y and Hochberg Y (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society. Series B (Methodological), 57, 289-300.
4 Boulesteix AL and Slawski M (2009). Stability and aggregation of ranked gene lists, Briefings in Bioinformatics, 10, 556-568.
5 Cui X and Churchill GA (2003). Statistical tests for differential expression in cDNA microarray experiments, Genome Biology, 4, 210.
6 Cui X, Hwang JT, Qiu J, Blades NJ, and Churchill GA (2005). Improved statistical tests for differential gene expression by shrinking variance components estimates, Biostatistics, 6, 59-75.
7 De Alava E, Panizo A, Antonescu CR, Huvos AG, Pardo-Mindan FJ, Barr FG, and Ladanyi M (2000). Association of EWS-FLI1 type 1 fusion with lower proliferative rate in Ewing's sarcoma, The American Journal of Pathology, 156, 849-855.
8 Efron B, Tibshirani R, Storey JD, and Tusher V (2001). Empirical Bayes analysis of a microarray experiment, Journal of the American Statistical Association, 96, 1151-1160.
9 Furlanello C, Serafini M, Merler S, and Jurman G (2003). Entropy-based gene ranking without selection bias for the predictive classification of microarray data, BMC Bioinformatics, 4, 54.
10 Jeffery IB, Higgins DG, and Culhane AC (2006). Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data, BMC Bioinformatics, 7, 359.
11 Kuner R, Muley T, Meister M, et al. (2009). Global gene expression analysis reveals specific patterns of cell junctions in non-small cell lung cancer subtypes, Lung Cancer, 63, 32-38.
12 Newton MA, Noueiry A, Sarkar D, and Ahlquist P (2004). Detecting differential gene expression with a semiparametric hierarchical mixture method, Biostatistics, 5, 155-176.
13 Noma H and Matsui S (2013). Empirical Bayes ranking and selection methods via semiparametric hierarchical mixture models in microarray studies, Statistics in Medicine, 32, 1904-1916.
14 Noma H, Matsui S, Omori T, and Sato T (2010). Bayesian ranking and selection methods using hierarchical mixture models in microarray studies, Biostatistics, 11, 281-289.
15 Pepe MS, Longton G, Anderson GL, and Schummer M (2003). Selecting differentially expressed genes from microarray experiments, Biometrics, 59, 133-142.
16 Sindhwani V, Bhattacharya P, and Rakshit S (2001). Information theoretic feature crediting in multiclass support vector machines. In Proceedings of the First SIAM International Conference on Data Mining, 5-7.
17 Tusher VG, Tibshirani R, and Chu G (2001). Significance analysis of microarrays applied to the ionizing radiation response, Proceedings of the National Academy of Sciences, 98, 5116-5121.
18 Smyth GK (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments, Statistical Applications in Genetics and Molecular Biology, 3, 3.
19 Storey JD (2003). The positive false discovery rate: a Bayesian interpretation and the q-value, Annals of Statistics, 31, 2013-2035.
20 Joober R, Benkelfat C, Toulouse A, et al. (1999). Analysis of 14 CAG repeat-containing genes in schizophrenia, American Journal of Medical Genetics (Neuropsychiatric Genetics), 88, 694-699.
21 Yu W, Chang YCI, and Park E (2014). A modified area under the ROC curve and its application to marker selection and classification, Journal of the Korean Statistical Society, 43, 161-175.
22 Yu WB, Park E, and Chang YCI (2015). Comparison of paired ROC curves through a two-stage test, Journal of Biopharmaceutical Statistics, 25, 881-902.