http://dx.doi.org/10.29220/CSAM.2018.25.3.307

Applying a modified AUC to gene ranking

Yu, Wenbao (Department of Statistics, Chonnam National University)
Chang, Yuan-Chin Ivan (Institute of Statistical Science, Academia Sinica)
Park, Eunsik (Department of Statistics, Chonnam National University)

Publication Information

Communications for Statistical Applications and Methods / v.25, no.3, 2018, pp. 307-319

Abstract

High-throughput technologies enable the simultaneous evaluation of thousands of genes that could discriminate between subclasses of complex diseases. Ranking genes according to differential expression is an important screening step for follow-up analysis, and many statistical measures have been proposed for this purpose. A good ranked list should be stable (at least for the top-ranked genes), and the top-ranked genes should have high power to differentiate disease statuses. However, the literature places little emphasis on ranking genes by these two criteria simultaneously. To meet both criteria, we propose applying a previously reported metric, the modified area under the receiver operating characteristic curve, to gene ranking. The proposed ranking method appears promising: it yields a stable ranked list and good prediction performance for the top-ranked genes. The findings are illustrated through studies on both synthetic data and real microarray gene expression data. The proposed method is recommended for ranking genes or other biomarkers in high-dimensional omics studies.
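
The two criteria named in the abstract, discriminating power and rank stability, can be illustrated with a small sketch. The definition of the modified AUC is not given in this record, so the code below swaps in the ordinary empirical AUC (the Mann-Whitney form) as a stand-in. The function names, the folded score max(AUC, 1 - AUC), and the bootstrap top-k overlap used as a stability measure are all illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def empirical_auc(x, y):
    """Empirical AUC of one gene: P(case > control) + 0.5 * P(tie).

    Stand-in for the modified AUC, whose definition is not in this record.
    """
    x, y = np.asarray(x), np.asarray(y)
    cases, controls = x[y == 1], x[y == 0]
    # All pairwise case-minus-control differences.
    diff = cases[:, None] - controls[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def rank_genes_by_auc(X, y):
    """Order genes (columns of X) from most to least discriminating.

    A gene with AUC near 0 separates the classes as well as one near 1,
    so each gene is scored by max(AUC, 1 - AUC) (an assumption made here).
    """
    X = np.asarray(X)
    auc = np.array([empirical_auc(X[:, j], y) for j in range(X.shape[1])])
    score = np.maximum(auc, 1.0 - auc)
    # Stable sort so that tied genes keep their original column order.
    order = np.argsort(-score, kind="stable")
    return order, score

def top_k_overlap(X, y, k, n_boot=50, seed=0):
    """Mean fraction of the original top-k genes that reappear in the
    top k after class-stratified bootstrap resampling (illustrative
    stability measure, not taken from the paper)."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    top = set(rank_genes_by_auc(X, y)[0][:k].tolist())
    idx0, idx1 = np.where(y == 0)[0], np.where(y == 1)[0]
    overlaps = []
    for _ in range(n_boot):
        # Resample within each class so both classes stay represented.
        idx = np.concatenate([rng.choice(idx0, idx0.size),
                              rng.choice(idx1, idx1.size)])
        boot_top = set(rank_genes_by_auc(X[idx], y[idx])[0][:k].tolist())
        overlaps.append(len(top & boot_top) / k)
    return float(np.mean(overlaps))
```

Called on an expression matrix `X` (samples by genes) and 0/1 labels `y`, `rank_genes_by_auc(X, y)` returns the gene order and scores, and `top_k_overlap(X, y, k)` gives a rough indication of how reproducible the top of that list is under resampling.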

Keywords

gene ranking; modified AUC; ROC curve

References

1	Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, and Levine AJ (1999). Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays, Proceedings of the National Academy of Sciences, 96, 6745-6750.
2	Bamber D (1975). The area above the ordinal dominance graph and the area below the receiver operating characteristic graph, Journal of Mathematical Psychology, 12, 387-415.
3	Benjamini Y and Hochberg Y (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society. Series B (Methodological), 57, 289-300.
4	Boulesteix AL and Slawski M (2009). Stability and aggregation of ranked gene lists, Briefings in Bioinformatics, 10, 556-568.
5	Cui X and Churchill GA (2003). Statistical tests for differential expression in cDNA microarray experiments, Genome Biology, 4, 210.
6	Cui X, Hwang JT, Qiu J, Blades NJ, and Churchill GA (2005). Improved statistical tests for differential gene expression by shrinking variance components estimates, Biostatistics, 6, 59-75.
7	De Alava E, Panizo A, Antonescu CR, Huvos AG, Pardo-Mindan FJ, Barr FG, and Ladanyi M (2000). Association of EWS-FLI1 type 1 fusion with lower proliferative rate in Ewing's sarcoma, The American Journal of Pathology, 156, 849-855.
8	Efron B, Tibshirani R, Storey JD, and Tusher V (2001). Empirical Bayes analysis of a microarray experiment, Journal of the American Statistical Association, 96, 1151-1160.
9	Furlanello C, Serafini M, Merler S, and Jurman G (2003). Entropy-based gene ranking without selection bias for the predictive classification of microarray data, BMC Bioinformatics, 4, 54.
10	Jeffery IB, Higgins DG, and Culhane AC (2006). Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data, BMC Bioinformatics, 7, 359.
11	Kuner R, Muley T, Meister M, et al. (2009). Global gene expression analysis reveals specific patterns of cell junctions in non-small cell lung cancer subtypes, Lung Cancer, 63, 32-38.
12	Newton MA, Noueiry A, Sarkar D, and Ahlquist P (2004). Detecting differential gene expression with a semiparametric hierarchical mixture method, Biostatistics, 5, 155-176.
13	Noma H and Matsui S (2013). Empirical Bayes ranking and selection methods via semiparametric hierarchical mixture models in microarray studies, Statistics in Medicine, 32, 1904-1916.
14	Noma H, Matsui S, Omori T, and Sato T (2010). Bayesian ranking and selection methods using hierarchical mixture models in microarray studies, Biostatistics, 11, 281-289.
15	Pepe MS, Longton G, Anderson GL, and Schummer M (2003). Selecting differentially expressed genes from microarray experiments, Biometrics, 59, 133-142.
16	Sindhwani V, Bhattacharya P, and Rakshit S (2001). Information theoretic feature crediting in multiclass support vector machines. In Proceedings of the First SIAM International Conference on Data Mining, 5-7.
17	Tusher VG, Tibshirani R, and Chu G (2001). Significance analysis of microarrays applied to the ionizing radiation response, Proceedings of the National Academy of Sciences, 98, 5116-5121.
18	Smyth GK (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments, Statistical Applications in Genetics and Molecular Biology, 3, 3.
19	Storey JD (2003). The positive false discovery rate: a Bayesian interpretation and the q-value, Annals of Statistics, 31, 2013-2035.
20	Joober R, Benkelfat C, Toulouse A, et al. (1999). Analysis of 14 CAG repeat-containing genes in schizophrenia, American Journal of Medical Genetics (Neuropsychiatric Genetics), 88, 694-699.
21	Yu W, Chang YCI, and Park E (2014). A modified area under the ROC curve and its application to marker selection and classification, Journal of the Korean Statistical Society, 43, 161-175.
22	Yu WB, Park E, and Chang YCI (2015). Comparison of paired ROC curves through a two-stage test, Journal of Biopharmaceutical Statistics, 25, 881-902.