Cluster Analysis of Incomplete Microarray Data with Fuzzy Clustering

  • Kim, Dae-Won (School of Computer Science and Engineering, Chung-Ang University)
  • Published : 2007.06.30

Abstract

In this paper, we present a method for clustering incomplete microarray data using alternating optimization, which requires no prior imputation step. To reduce the influence of preprocessing imputation on the result, the alternating optimization approach refines the estimates of missing values during the iterative clustering process. In each iteration, the method improves these estimates by exploiting cluster information, such as the cluster centroids, together with all available non-missing values. The clustering results of the proposed method are significantly more relevant to biological gene annotations than those of other methods, indicating its effectiveness and potential for clustering incomplete gene expression data.
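The abstract does not spell out the update equations, so the following is only a minimal sketch of the general idea, assuming a fuzzy c-means objective in which missing entries are treated as extra variables and re-estimated from the current centroids and memberships (in the spirit of an optimal-completion strategy for incomplete data). The function name `fcm_incomplete`, its parameters, and the column-mean initialization are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fcm_incomplete(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Sketch: fuzzy c-means on data with NaNs, re-estimating the
    missing entries in every iteration (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Assumption: start each missing entry at the mean of the observed
    # values in its column (requires at least one observed value per column).
    col_mean = np.nanmean(X, axis=0)
    Xf = np.where(missing, col_mean, X)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centroids: membership-weighted means of the completed data.
        V = (W.T @ Xf) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every row to every centroid.
        d2 = ((Xf[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Standard fuzzy c-means membership update.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        # Re-estimate missing entries as membership-weighted centroid
        # values; observed entries are always kept as measured.
        W_new = U_new ** m
        est = (W_new @ V) / W_new.sum(axis=1, keepdims=True)
        Xf = np.where(missing, est, X)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V, Xf
```

Called as `U, V, Xf = fcm_incomplete(X, n_clusters=2)` on a genes-by-conditions matrix `X` with `NaN` marking missing measurements, the sketch returns the membership matrix, the centroids, and the completed data. Alternating between the clustering update and the missing-value update is what removes the need for a separate imputation pass before clustering.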
