
Searching for Optimal Ensemble of Feature-classifier Pairs in Gene Expression Profile using Genetic Algorithm  

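Chan-Ho Park (Department of Computer Engineering, Yonsei University)
Sung-Bae Cho (Department of Computer Engineering, Yonsei University)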
Abstract
A gene expression profile is numerical data describing the expression levels of an organism's genes, measured on a microarray. In general, each specific tissue shows different expression levels in the related genes, so disease can be classified from a gene expression profile. Because not all genes are related to the disease, two steps are needed: selecting the related genes, which is called feature selection, and classifying samples properly using the selected genes. This paper proposes a GA-based method that searches for the optimal ensemble of feature-classifier pairs, where the pairs are composed from seven feature selection methods based on correlation, similarity, and information theory, and six representative classifiers. In experiments with leave-one-out cross-validation on two cancer-related gene expression profiles, the Lymphoma dataset and the Colon dataset, the method finds ensembles that perform much better than every individual feature-classifier pair.
Keywords
GA; gene expression profile; feature selection; classifier; ensemble
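The search described in the abstract can be illustrated with a minimal sketch. The abstract does not give the chromosome encoding, the fitness function, or the rule for combining classifiers, so everything below is an assumption made for illustration: one bit per feature-classifier pair (7 × 6 = 42 bits), majority voting as the combiner, and ensemble accuracy on held-out predictions as the fitness. The names `make_toy_data`, `ensemble_accuracy`, and `genetic_search` are hypothetical, and the predictions are synthetic stand-ins for what a real leave-one-out run of each pair would produce.

```python
import random

N_FEATURE_METHODS = 7   # from the abstract
N_CLASSIFIERS = 6       # from the abstract
N_PAIRS = N_FEATURE_METHODS * N_CLASSIFIERS  # assumed: every combination is a pair


def make_toy_data(n_samples=40, accuracy=0.7, seed=0):
    """Synthetic stand-in: each pair predicts the true label with a fixed probability."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_samples)]
    predictions = [
        [y if rng.random() < accuracy else 1 - y for y in labels]
        for _ in range(N_PAIRS)
    ]
    return labels, predictions


def majority_vote(votes):
    """Most frequent label; ties go to the smallest label (an arbitrary choice)."""
    counts = {}
    for v in votes:
        counts[v] = counts.get(v, 0) + 1
    top = max(counts.values())
    return min(label for label, c in counts.items() if c == top)


def ensemble_accuracy(chromosome, predictions, labels):
    """Assumed fitness: accuracy of the majority vote over the selected pairs.

    predictions[p][i] is pair p's held-out prediction for sample i.
    """
    selected = [p for p, bit in enumerate(chromosome) if bit]
    if not selected:
        return 0.0
    hits = sum(
        majority_vote([predictions[p][i] for p in selected]) == y
        for i, y in enumerate(labels)
    )
    return hits / len(labels)


def genetic_search(predictions, labels, pop_size=30, generations=40,
                   mutation_rate=0.02, seed=0):
    """Simple GA: tournament selection, one-point crossover, bit-flip mutation, elitism."""
    rng = random.Random(seed)
    n = len(predictions)

    def fitness(c):
        return ensemble_accuracy(c, predictions, labels)

    # Seed the population with the best single pair so the result is never worse than it.
    singles = [[1 if j == p else 0 for j in range(n)] for p in range(n)]
    population = [max(singles, key=fitness)]
    population += [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]

    for _ in range(generations):
        scores = [fitness(c) for c in population]
        elite = population[scores.index(max(scores))]

        def tournament():
            i, j = rng.sample(range(pop_size), 2)
            return population[i] if scores[i] >= scores[j] else population[j]

        next_pop = [elite[:]]  # elitism: carry the best chromosome forward unchanged
        while len(next_pop) < pop_size:
            a, b = tournament(), tournament()
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            next_pop.append(child)
        population = next_pop

    best = max(population, key=fitness)
    return best, fitness(best)


labels, predictions = make_toy_data(seed=1)
best, acc = genetic_search(predictions, labels, seed=1)
print("selected pairs:", [p for p, bit in enumerate(best) if bit])
print("ensemble accuracy on toy data:", acc)
```

In a real experiment, `predictions` would be filled by running each feature selection method and classifier under leave-one-out cross-validation. Because the same held-out predictions are then used both to choose the ensemble and to score it, the reported accuracy of the chosen ensemble is likely to be optimistic unless a separate evaluation set is kept aside.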