Removing Non-informative Features by Robust Feature Wrapping Method for Microarray Gene Expression Data  

Lee, Jae-Sung (Department of Computer Engineering, Chung-Ang University)
Kim, Dae-Won (Department of Computer Engineering, Chung-Ang University)
Abstract
Because microarray gene expression datasets are high-dimensional, machine learning algorithms have typically relied on feature selection techniques to classify them effectively. However, the large number of features relative to the number of samples makes feature selection computationally prohibitive and prone to errors. One traditional feature selection approach is feature filtering, which evaluates one gene at a time. Feature filtering is therefore a univariate approach that cannot account for multivariate correlations among genes. In this paper, we propose a function that measures both class separability and correlations. With this approach, we solve the problem associated with the feature filtering approach.
Keywords
Bioinformatics; HCA; Genetic algorithm; Feature selection; Correlation coefficient;
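The abstract does not give the proposed function, so the following is only a minimal sketch of the general idea it describes: scoring a subset of genes by both class separability and the correlations among the selected genes. The separability measure (a two-class mean difference divided by the summed standard deviations), the use of mean absolute Pearson correlation, the `penalty` weight, and all function names are assumptions made for illustration, not the authors' method.

```python
import numpy as np

def separability(X, y):
    """Per-gene two-class separability: |mean0 - mean1| / (std0 + std1).
    Assumed measure for illustration; X is samples x genes, y holds 0/1 labels."""
    a, b = X[y == 0], X[y == 1]
    eps = 1e-12  # avoids division by zero when a gene is constant within both classes
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (a.std(axis=0) + b.std(axis=0) + eps)

def mean_abs_correlation(X):
    """Mean absolute Pearson correlation over all distinct gene pairs."""
    n_genes = X.shape[1]
    if n_genes < 2:
        return 0.0
    corr = np.corrcoef(X, rowvar=False)  # genes are columns
    upper = corr[np.triu_indices(n_genes, k=1)]
    return float(np.mean(np.abs(upper)))

def subset_score(X, y, subset, penalty=1.0):
    """Hypothetical multivariate score: reward separability, penalize redundancy."""
    Xs = X[:, subset]
    return float(separability(Xs, y).mean()) - penalty * mean_abs_correlation(Xs)

# Toy data: gene 1 duplicates gene 0; gene 2 is equally separable but less redundant.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X = np.array([
    [0, 1, 0, 1, 3, 4, 3, 4],   # gene 0
    [0, 1, 0, 1, 3, 4, 3, 4],   # gene 1 (copy of gene 0)
    [0, 0, 1, 1, 3, 3, 4, 4],   # gene 2
], dtype=float).T

redundant_pair = subset_score(X, y, [0, 1])
complementary_pair = subset_score(X, y, [0, 2])
print(redundant_pair, complementary_pair)
```

A univariate filter would rank all three toy genes equally, since each has the same separability on its own. The subset score ranks the pair of genes 0 and 2 above the duplicated pair of genes 0 and 1, which is the kind of multivariate distinction a wrapper search (for example, a genetic algorithm over subsets, as the keywords suggest) could exploit.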