http://dx.doi.org/10.3745/KIPSTD.2009.16-D.3.327

Macroscopic Biclustering of Gene Expression Data  

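Ahn, Jae-Gyoon (Department of Computer Science, Yonsei University)
Yoon, Young-Mi (Gachon University of Medicine and Science)
Park, Sang-Hyun (Department of Computer Science, Yonsei University)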
Abstract
A microarray dataset is a two-dimensional dataset consisting of a set of genes and a set of conditions. A bicluster is a subset of genes that show similar behavior within a subset of conditions. Genes that show similar behavior can be considered to share the same cellular function. A biclustering algorithm is therefore a useful tool for uncovering groups of genes involved in the same cellular process and the groups of conditions under which that process takes place. We propose a polynomial-time algorithm that identifies functionally highly correlated biclusters. Our algorithm 1) identifies gene sets with hidden patterns even when the noise level is high, 2) identifies multiple, possibly overlapping, and diverse gene sets, 3) identifies gene sets with strong functional association, and 4) produces deterministic biclustering results. We validated the level of functional association of the biclusters found by our method and compared it with that of current methods using GO (Gene Ontology).
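To make the definition of a bicluster concrete, the following minimal Python sketch scores how coherently a chosen subset of genes behaves across a chosen subset of conditions. It uses the mean squared residue of Cheng and Church as the coherence measure, which is an assumption made purely for illustration and is not the algorithm proposed in this paper. The toy matrix `expression` and the function name `mean_squared_residue` are likewise hypothetical.

```python
from statistics import mean


def mean_squared_residue(matrix, genes, conditions):
    """Cheng-Church mean squared residue of the submatrix genes x conditions.

    Illustrative coherence measure only; not the method of this paper.
    A value near 0 means the selected genes follow the same additive
    pattern across the selected conditions.
    """
    sub = [[matrix[g][c] for c in conditions] for g in genes]
    row_means = [mean(row) for row in sub]
    col_means = [mean(col) for col in zip(*sub)]
    overall = mean(row_means)  # rows have equal length, so this is the grand mean
    total = 0.0
    for i, row in enumerate(sub):
        for j, value in enumerate(row):
            residue = value - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (len(genes) * len(conditions))


# Hypothetical toy data: rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 9.0, 4.0],
    [2.0, 3.0, 1.0, 5.0],
    [5.0, 6.0, 7.0, 8.0],
    [3.0, 0.0, 4.0, 1.0],
]

# Genes 0-2 differ only by a constant shift under conditions 0, 1, and 3.
coherent = mean_squared_residue(expression, [0, 1, 2], [0, 1, 3])
# The full matrix has no such shared pattern.
whole = mean_squared_residue(expression, [0, 1, 2, 3], [0, 1, 2, 3])
```

In this toy example the candidate bicluster (genes 0 to 2 under conditions 0, 1, and 3) scores close to zero, while the full matrix scores much higher, which is the sense in which a bicluster is a local pattern that clustering over all conditions can miss.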
Keywords
Data Mining; Biclustering; Gene Expression Data Analysis; Microarray Analysis; Noise