http://dx.doi.org/10.5351/CKSS.2006.13.1.113

Use of Factor Analyzer Normal Mixture Model with Mean Pattern Modeling on Clustering Genes  

Kim Seung-Gu (Department of Applied Statistics, SangJi University)
Publication Information
Communications for Statistical Applications and Methods / v.13, no.1, 2006, pp. 113-123
Abstract
The normal mixture model (NMM) is frequently used to cluster genes in microarray gene expression data. In this paper, some of the component means of the NMM are modelled by a linear regression model whose design matrix represents the pattern between sample classes in the microarray data matrix. Modelling the component means through given design matrices has the advantage that the clusters can be steered toward patterns designed in advance. However, the NMM suffers from an 'overfitting' problem because in practice gene expression profiles are often high-dimensional. This problem also arises when the NMM with component means restricted by the linear model is fitted. To cope with this problem, this paper proposes the use of the factor analyzer NMM restricted by the linear model to cluster genes. Several design matrices that are useful for clustering genes are also provided.
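The two ideas in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function names, the indicator-style design matrix, the sample labels, and the coefficient values are all assumptions made for the example. The parameter counts use the standard formulas for an unrestricted p x p covariance matrix and for a factor-analytic covariance of the form B B^T + D with q factors, which the abstract itself does not state.

```python
# Hypothetical sketch: names and numbers below are assumptions, not from the paper.

def design_matrix(class_labels, n_classes):
    """Indicator design matrix: row j has a 1 in the column of sample j's class."""
    return [[1.0 if label == c else 0.0 for c in range(n_classes)]
            for label in class_labels]

def component_mean(A, beta):
    """Component mean mu = A * beta, one entry per sample (tissue)."""
    return [sum(a * b for a, b in zip(row, beta)) for row in A]

def full_cov_params(p):
    """Free parameters in an unrestricted p x p covariance matrix."""
    return p * (p + 1) // 2

def factor_cov_params(p, q):
    """Free parameters in a factor-analytic covariance B B^T + D with q factors."""
    return p * q + p - q * (q - 1) // 2

# Six samples: the first three from class 0, the last three from class 1 (assumed).
labels = [0, 0, 0, 1, 1, 1]
A = design_matrix(labels, 2)

# A cluster whose genes are expressed only in class 1 (assumed coefficients).
mu = component_mean(A, [0.0, 2.0])
print(mu)

# Covariance parameters per component for genes measured on p = 60 samples.
print(full_cov_params(60))       # 1830
print(factor_cov_params(60, 2))  # 179
```

With the mean pattern fixed by the design matrix, each component needs only as many mean coefficients as the design matrix has columns, and the factor-analytic form cuts the covariance parameters per component from 1830 to 179 in this assumed setting. That reduction is the kind of saving the abstract points to as a remedy for overfitting.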
Keywords
Clustering genes; Design matrix; Factor analyzer normal mixture model