http://dx.doi.org/10.5351/KJAS.2011.24.6.1103

Performance Comparison of Classification Methods with the Combinations of the Imputation and Gene Selection Methods

Kim, Dong-Uk (Department of Statistics, Sungkyunkwan University)
Nam, Jin-Hyun (Department of Statistics, Sungkyunkwan University)
Hong, Kyung-Ha (Korea Institute for Defense Analyses)
Publication Information
The Korean Journal of Applied Statistics / v.24, no.6, 2011, pp. 1103-1113
Abstract
Gene expression data are obtained through a multi-stage experiment, and errors introduced along the way can produce missing values. Because such data are 'small n, large p' (few samples, many genes), a subset of genes must be selected before statistical analyses such as classification. Imputation and gene selection are therefore both important in microarray data analysis. In the literature, imputation, gene selection, and classification have each been studied separately. In practice, however, the three steps are applied sequentially. We therefore compare the performance of classification methods after imputation and gene selection methods have been applied to microarray data. Numerical simulations are carried out to evaluate the classification methods under various combinations of imputation and gene selection methods.
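
The sequential processing described above (impute, then select genes, then classify) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the specific methods compared, so gene-wise mean imputation, a two-sample t-like score, and a nearest-centroid classifier serve only as simple stand-ins, and the toy matrix `X` (rows are samples, columns are genes, `None` marks a missing value) is invented.

```python
import math
from statistics import mean, pvariance

def impute_gene_means(X):
    """Fill each missing entry (None) with the mean of that gene's observed values."""
    n_genes = len(X[0])
    col_means = []
    for j in range(n_genes):
        observed = [row[j] for row in X if row[j] is not None]
        col_means.append(mean(observed) if observed else 0.0)
    return [[col_means[j] if v is None else v for j, v in enumerate(row)]
            for row in X]

def select_genes(X, y, k):
    """Keep the k genes with the largest absolute t-like score (two classes assumed)."""
    a, b = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        xa = [row[j] for row, label in zip(X, y) if label == a]
        xb = [row[j] for row, label in zip(X, y) if label == b]
        se = math.sqrt(pvariance(xa) / len(xa) + pvariance(xb) / len(xb))
        scores.append(abs(mean(xa) - mean(xb)) / (se + 1e-12))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def fit_centroids(X, y, genes):
    """Compute one centroid per class, restricted to the selected genes."""
    centroids = {}
    for label in set(y):
        rows = [row for row, l in zip(X, y) if l == label]
        centroids[label] = [mean(r[j] for r in rows) for j in genes]
    return centroids

def predict(centroids, x, genes):
    """Assign x to the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(genes, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Hypothetical toy data: 6 samples, 4 genes; genes 0 and 2 separate the classes.
X = [
    [1.0, 5.0, None, 3.0],
    [1.2, 4.0, 0.9, 3.1],
    [0.8, 6.0, 1.1, 2.9],
    [3.0, 5.0, 3.1, 3.0],
    [None, 4.0, 2.9, 3.1],
    [3.2, 6.0, 3.0, 2.9],
]
y = ["A", "A", "A", "B", "B", "B"]

imputed = impute_gene_means(X)             # step 1: imputation
selected = select_genes(imputed, y, k=2)   # step 2: gene selection
centroids = fit_centroids(imputed, y, selected)  # step 3: classification
pred_low = predict(centroids, [0.9, 5.0, 1.0, 3.0], selected)
pred_high = predict(centroids, [3.1, 5.0, 3.0, 3.0], selected)
```

Swapping any one of the three functions for another method while holding the other two fixed gives one cell of the kind of comparison the abstract describes. In a real evaluation, imputation and gene selection would normally be refit inside each cross-validation fold so that held-out samples do not influence them.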
Keywords
Gene expression; imputation; gene selection; classification