Detection of Differentially Expressed Genes by Clustering Genes Using Class-Wise Averaged Data in Microarray Data

  • Kim, Seung-Gu (Department of Data & Information, Sangji University)
  • Published : 2007.12.31

Abstract

A normal mixture model that incorporates dependence between classes is proposed for detecting differentially expressed genes. Gene clustering approaches suffer from the high column dimension of the microarray expression data matrix, which leads to over-fitting. Various methods have been proposed to address this problem. In this paper, simple averaging of the data within each class is proposed to overcome the problems caused by high dimensionality when the normal mixture model is fitted. Experiments on a simulated data set and a real data set demonstrate the practical usefulness of the approach.
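The class-wise averaging step described in the abstract can be illustrated with a minimal sketch. It assumes the expression data are stored as a genes-by-samples list of rows with one class label per sample; the function name `classwise_average` and the toy values are hypothetical and are not taken from the paper. The sketch covers only the dimension reduction, not the fitting of the normal mixture model.

```python
from collections import defaultdict
from statistics import fmean


def classwise_average(expr, labels):
    """Reduce a genes x samples matrix to a genes x classes matrix of within-class means.

    expr   : list of rows, one row of expression values per gene (assumed layout)
    labels : class label of each sample (column), same length as every row
    """
    if any(len(row) != len(labels) for row in expr):
        raise ValueError("each gene row must have one value per sample label")

    # Column indices of the samples in each class, in first-seen class order.
    cols = defaultdict(list)
    for j, lab in enumerate(labels):
        cols[lab].append(j)
    classes = list(cols)

    # One mean per class for every gene: the column dimension drops from
    # the number of samples to the number of classes.
    averaged = [[fmean(row[j] for j in cols[c]) for c in classes] for row in expr]
    return classes, averaged


# Toy data (hypothetical): 2 genes, 4 samples, 2 classes.
expr = [
    [1.0, 3.0, 10.0, 12.0],  # gene whose class means differ
    [5.0, 5.0, 5.0, 5.0],    # gene with equal class means
]
labels = ["normal", "normal", "tumor", "tumor"]
classes, averaged = classwise_average(expr, labels)
```

Each gene is then represented by a short vector with one entry per class, so a mixture model fitted to these vectors has far fewer parameters to estimate than one fitted to the full sample-level rows.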
