http://dx.doi.org/10.5351/CKSS.2006.13.1.113

Use of Factor Analyzer Normal Mixture Model with Mean Pattern Modeling on Clustering Genes

Kim Seung-Gu (Department of Applied Statistics, SangJi University)

Publication Information

Communications for Statistical Applications and Methods, v.13, no.1, 2006, pp. 113-123

Abstract

The normal mixture model (NMM) is frequently used to cluster genes in microarray gene expression data. In this paper, some of the component means of the NMM are modelled by a linear regression model whose design matrix represents the pattern between sample classes in the microarray data matrix. Modelling the component means through given design matrices has the advantage that the clustering can be steered toward clusters designed in advance. However, the NMM suffers from an overfitting problem because, in practice, gene expression profiles are often high-dimensional. The same problem arises when the NMM restricted by the linear model for the component means is fitted. To cope with this problem, this paper proposes the use of the factor analyzer NMM restricted by the linear model to cluster genes. Several design matrices that are useful for clustering genes are also provided.
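The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it describes, under stated assumptions: that a restricted component mean takes the form mu = X beta for a design matrix X encoding the sample classes, and that the factor analyzer covariance has the usual form Sigma = B B^T + D with q factors. The helper names, the two-class labels, and the numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np

def class_pattern_design(labels):
    """Hypothetical helper: one indicator column per sample class."""
    classes = sorted(set(labels))
    return np.array([[1.0 if lab == c else 0.0 for c in classes]
                     for lab in labels])

def covariance_param_count(p, q=None):
    """Free parameters in one component covariance of dimension p.

    Full covariance: p(p+1)/2.
    Factor analyzer form B B^T + D with q factors: p*q + p - q(q-1)/2.
    """
    if q is None:
        return p * (p + 1) // 2
    return p * q + p - q * (q - 1) // 2

# Assumed layout: each gene is observed on 6 samples, 3 per class.
labels = ["normal"] * 3 + ["tumor"] * 3
X = class_pattern_design(labels)   # 6 x 2 design matrix
beta = np.array([0.0, 2.0])        # illustrative class-level means for one cluster
mu = X @ beta                      # restricted component mean

# Factor analyzer covariance with q = 1 factor (illustrative values).
rng = np.random.default_rng(0)
p, q = X.shape[0], 1
B = rng.normal(size=(p, q))                  # factor loadings
D = np.diag(rng.uniform(0.5, 1.0, size=p))   # diagonal error variances
Sigma = B @ B.T + D

print(mu)
print(covariance_param_count(p), covariance_param_count(p, q))
```

With many samples per gene the gap widens quickly: under these formulas a full covariance for p = 60 has 1830 free parameters, while the factor analyzer form with q = 2 has 179, which is the kind of reduction that motivates the factor analyzer restriction against overfitting.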

Keywords

Clustering genes; Design matrix; Factor analyzer normal mixture model

Citations & Related Records

References

1	Francesco and Chiaramonte (2001). Analyzing Gene Expression Data From Microarrays: A Mixture-Based Approach. ENAR 2001 Spring Meeting, 25-28 March 2001, Charlotte, North Carolina, USA
2	Alon, U., Barkai, N., Notterman, D.A., Gish, K., Ybarra, S., Mack, D., and Levine, A.J. (1999). Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proceedings of the National Academy of Sciences USA, Vol. 96, 6745-6750
3	Ben-Dor, A., Bruhn, L., Friedman, N., Nachman, I., Schummer, M., and Yakhini, Z. (2000). Tissue classification with gene expression profiles. Journal of Computational Biology, Vol. 7, 559-584
4	McLachlan, G.J., and Peel, D. (2000). Finite Mixture Models. New York: Wiley
5	McLachlan, G.J., Bean, R.W., and Peel, D. (2002). A mixture model-based approach to the clustering of microarray expression data. Bioinformatics, Vol. 18, 413-422
6	McLachlan, G.J., Do, K-A., and Ambroise, C. (2004). Analyzing Microarray Gene Expression Data. Wiley and Sons
7	Meng, X.L., and van Dyk, D. (1997). The EM algorithm-an old folk song sung to a fast new tune (with discussion). Journal of the Royal Statistical Society, Series B, Vol. 59, 511-567
8	Segal, E., Wang, H., and Koller, D. (2003). Discovering molecular pathways from protein interaction and gene expression data. Bioinformatics, Vol. 19 (Suppl. 1), i264-i272

1	Detection of Differentially Expressed Genes by Clustering Genes Using Class-Wise Averaged Data in Microarray Data / Kim, Seung-Gu / Communications for Statistical Applications and Methods
2	Normal Mixture Model with General Linear Regressive Restriction: Applied to Microarray Gene Clustering / Kim, Seung-Gu / Communications for Statistical Applications and Methods
3	Variable Selection in Normal Mixture Model Based Clustering under Heteroscedasticity / Kim, Seung-Gu / The Korean Journal of Applied Statistics