http://dx.doi.org/10.9708/jksci.2010.15.12.197

Class prediction of an independent sample using a set of gene modules consisting of gene-pairs which were condition (Tumor, Normal) specific

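Jeong, Hyeon-Iee (Department of IT, Gachon University of Medicine and Science)
Yoon, Young-Mi (Division of Information Engineering, Gachon University of Medicine and Science)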
Abstract
Applying data-mining methods to high-throughput cDNA microarray data makes it possible to compare gene expression levels in two different tissues and to detect differentially expressed genes (DEGs) between normal and tumor cells. These genes can support diagnosis, and a treatment strategy can be chosen according to the cancer stage. Existing machine-learning methods for cancer classification select marker genes that are differentially expressed between normal and tumor samples and build a classifier from those marker genes. However, beyond differences in expression level, differences in gene-gene correlation between the two conditions could also serve as good markers for disease diagnosis. In this study, we identify gene pairs whose correlation differs greatly between the two sets of samples and build classification modules from these gene pairs. This module-based cancer classification method achieves higher accuracy than existing methods. Because the number of genes in a classification module is small, implementation as a clinical kit can be considered. In future work, the authors plan to identify novel cancer-related genes through functional analysis of the genes in a classification module, validated by GO (Gene Ontology) enrichment, and to extend the classification modules into gene regulatory networks.
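The pair-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which correlation measure or selection threshold was used, so Pearson correlation, the hypothetical function name `top_differential_pairs`, the `top_k` parameter, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def top_differential_pairs(tumor, normal, top_k=5):
    """Rank gene pairs by |corr_tumor - corr_normal| (hypothetical helper).

    tumor, normal: arrays of shape (n_samples, n_genes) with the same gene order.
    Returns the top_k pairs as (gene_i, gene_j, correlation_difference).
    Pearson correlation is an assumption; the abstract names no measure.
    """
    corr_t = np.corrcoef(tumor, rowvar=False)   # gene-by-gene correlations, tumor
    corr_n = np.corrcoef(normal, rowvar=False)  # gene-by-gene correlations, normal
    diff = corr_t - corr_n
    # Visit each unordered pair once (upper triangle, diagonal excluded).
    i_idx, j_idx = np.triu_indices(diff.shape[0], k=1)
    order = np.argsort(-np.abs(diff[i_idx, j_idx]))[:top_k]
    return [(int(i_idx[o]), int(j_idx[o]), float(diff[i_idx[o], j_idx[o]]))
            for o in order]


# Synthetic illustration: genes 0 and 1 move together in tumor samples
# and in opposite directions in normal samples; other genes are noise.
rng = np.random.default_rng(0)
n_samples, n_genes = 40, 6
tumor = rng.normal(size=(n_samples, n_genes))
normal = rng.normal(size=(n_samples, n_genes))
tumor[:, 1] = tumor[:, 0] + 0.1 * rng.normal(size=n_samples)
normal[:, 1] = -normal[:, 0] + 0.1 * rng.normal(size=n_samples)

pairs = top_differential_pairs(tumor, normal, top_k=3)
print(pairs[0])  # the highest-ranked pair and its correlation difference
```

Pairs ranked this way would then be grouped into classification modules; how the modules are assembled and used for prediction is described in the paper itself and is not reproduced here.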
Keywords
data mining; classification; knowledge-based data mining; microarray data classification