Gene Expression Data Analysis Using Seed Clustering

Shin Myoung

Journal of the Institute of Electronics Engineers of Korea CI (전자공학회논문지CI)

Volume 42, Issue 1, Pages 1-7, 2005, pISSN 1229-6376

The Institute of Electronics and Information Engineers (대한전자공학회)

시드 클러스터링 방법에 의한 유전자 발현 데이터 분석

Shin Myoung (Bioinformatics Research Team, Electronics and Telecommunication Research Institute)

신미영 (한국전자통신연구원 바이오정보연구팀)

Published : 2005.01.01

Abstract

Cluster analysis of microarray data has often been used to find biologically relevant groups of genes based on their expression levels. Since many functionally related genes tend to be co-expressed, identifying groups of genes with similar expression profiles allows the functions of unknown genes to be inferred from those of known genes in the same group. In this paper we present a novel clustering approach, called seed clustering, and investigate its applicability to microarray data analysis. In the seed clustering method, seed genes are first extracted by computational analysis of their expression profiles, and clusters are then generated by taking the seed genes as prototype vectors for the target clusters. Because it rests on strong mathematical foundations, the seed clustering method produces stable and consistent results in a systematic way. Our empirical results also indicate that the automatically extracted seed genes represent the potential clusters hidden in the data well, and that the method's performance compares favorably with current approaches.

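Cluster analysis of microarray data is often used to find biologically related groups of genes. Because functionally related genes usually show similar expression patterns, finding groups of genes with similar expression profiles makes it possible to infer the functions of unknown genes from other genes in the same group. This paper proposes a new seed clustering algorithm for cluster analysis and applies it to microarray data analysis. The seed clustering method computationally analyzes the given data to extract seed patterns automatically, and then generates clusters by treating these seed patterns as prototype vectors of the target clusters. Because it is grounded in mathematical principles, the seed clustering method can produce stable and consistent clustering results in a highly systematic way. In addition, when applied to real microarray data, it automatically extracted seed patterns representing each cluster inherent in the data very effectively, and its clustering results tended to be somewhat better than those of other methods.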
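
The two-stage idea described in the abstract (pick seed genes, then use them as cluster prototypes) can be illustrated with a minimal sketch. The abstract does not say how seeds are computed, so the greedy selection in `pick_seeds` below is a hypothetical stand-in and not the paper's procedure. The use of Pearson correlation as the similarity measure, the function names, and the toy expression matrix are likewise assumptions made for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def spread(profile):
    """Sum of squared deviations from the mean of one profile."""
    mean = sum(profile) / len(profile)
    return sum((x - mean) ** 2 for x in profile)

def pick_seeds(profiles, k):
    """Stand-in seed selection (an assumption, not the paper's method):
    start with the most variable gene, then repeatedly add the gene whose
    highest correlation with any already chosen seed is lowest."""
    seeds = [max(range(len(profiles)), key=lambda i: spread(profiles[i]))]
    while len(seeds) < k:
        candidates = [i for i in range(len(profiles)) if i not in seeds]
        seeds.append(min(
            candidates,
            key=lambda i: max(pearson(profiles[i], profiles[s]) for s in seeds),
        ))
    return seeds

def assign_to_seeds(profiles, seeds):
    """Use each seed profile as a prototype: every gene joins the seed
    it correlates with most strongly. Returns one seed index per gene."""
    return [max(seeds, key=lambda s: pearson(p, profiles[s])) for p in profiles]

# Toy data: rows are genes, columns are time points (made-up values).
profiles = [
    [1, 2, 3, 4],  # rising
    [2, 4, 6, 9],  # rising
    [1, 3, 4, 6],  # rising
    [4, 3, 2, 1],  # falling
    [9, 6, 4, 2],  # falling
    [6, 4, 3, 1],  # falling
]

seeds = pick_seeds(profiles, k=2)
labels = assign_to_seeds(profiles, seeds)
print("seed genes:", seeds)
print("cluster label per gene:", labels)
```

On this toy matrix the rising and falling genes end up in separate clusters, each labeled by the index of its seed gene. Because both stages are deterministic, repeated runs give the same result, which mirrors the stability the abstract attributes to seed-based prototypes.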
