
Classifying Cancer Using Partially Correlated Genes Selected by Forward Selection Method  

유시호 (Department of Computer Science, Yonsei University)
조성배 (Department of Computer Science, Yonsei University)
Abstract
A gene expression profile is numerical data describing the expression levels of an organism's genes, measured on a microarray. Because each tissue type generally shows different expression levels in the genes related to it, cancer can be classified from gene expression profiles. Not all genes are relevant to classification, so the relevant genes must be selected, a step called feature selection. This paper proposes a new gene selection method based on the forward selection method from regression analysis. The method reduces redundant information among the selected genes, which makes classification more efficient. We used k-nearest neighbor as the classifier and tested the method on a colon cancer dataset. Compared with selection by Pearson's and Spearman's correlation coefficients, the proposed method performed better, reaching 90.3% classification accuracy. The method was also applied successfully to a lymphoma dataset.
Keywords
gene expression profile; feature selection; forward selection method; classification; regression analysis
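
The abstract does not spell out the selection criterion, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that forward selection regresses the class label on the genes chosen so far and, at each step, adds the gene that lowers the residual sum of squares the most. The function names (`forward_select`, `rank_by_pearson`, `knn_predict`) and the tiny toy matrix are hypothetical and made up for illustration; they are not the colon or lymphoma data.

```python
import numpy as np


def forward_select(X, y, n_genes):
    """Greedy forward selection (assumed criterion): at each step add the
    gene (column of X) that yields the lowest residual sum of squares when
    y is regressed on the genes chosen so far plus that candidate."""
    n_samples, n_features = X.shape
    selected = []
    for _ in range(n_genes):
        best_j, best_rss = None, np.inf
        for j in range(n_features):
            if j in selected:
                continue
            # Design matrix: intercept + already selected genes + candidate j.
            A = np.column_stack([np.ones(n_samples), X[:, selected + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ coef) ** 2))
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
    return selected


def rank_by_pearson(X, y, n_genes):
    """Baseline: take the n_genes columns with the largest |Pearson r| to y."""
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return np.argsort(-np.abs(r))[:n_genes].tolist()


def knn_predict(X_train, y_train, X_test, k=3):
    """Plain k-nearest-neighbor majority vote with Euclidean distance."""
    preds = []
    for x in X_test:
        dist = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(dist)[:k]
        votes = np.bincount(y_train[nearest].astype(int))
        preds.append(int(np.argmax(votes)))
    return np.array(preds)


# Hypothetical toy data: 8 samples (rows) x 4 genes (columns).
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
X = np.array([
    [0.0, 0, 0, 1, 1, 1, 1, 1],  # gene 0: informative, wrong on one sample
    [0.1, 0, 0, 1, 1, 1, 1, 1],  # gene 1: near-duplicate of gene 0
    [0.0, 0, 0, 1, 0, 0, 0, 0],  # gene 2: weak alone, complements gene 0
    [1.0, 0, 1, 0, 1, 0, 1, 0],  # gene 3: unrelated to the class label
]).T

fs_genes = forward_select(X, y, 2)
pc_genes = rank_by_pearson(X, y, 2)
pred = knn_predict(X[:, fs_genes], y, np.array([[1.0, 1.0], [1.0, 0.0]]), k=1)
print(fs_genes, pc_genes, pred)
```

On this toy data, forward selection picks genes 0 and 2, whereas ranking by Pearson's coefficient picks genes 0 and 1, the second of which is almost a copy of the first. This is the kind of redundancy the abstract says the proposed method reduces.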