
FCAnalyzer: A Functional Clustering Analysis Tool for Predicted Transcription Regulatory Elements and Gene Ontology Terms  

Kim, Sang-Bae (Korean BioInformation Center, Korea Research Institute of Bioscience and Biotechnology)
Ryu, Gil-Mi (Center for Genome Science, National Institute of Health)
Kim, Young-Jin (Center for Genome Science, National Institute of Health)
Heo, Jee-Yeon (Center for Genome Science, National Institute of Health)
Park, Chan (Center for Genome Science, National Institute of Health)
Oh, Berm-Seok (Center for Genome Science, National Institute of Health)
Kim, Hyung-Lae (Center for Genome Science, National Institute of Health)
Kimm, Ku-Chan (Center for Genome Science, National Institute of Health)
Kim, Kyu-Won (College of Pharmacy, Seoul National University)
Kim, Young-Youl (Center for Genome Science, National Institute of Health)
Abstract
Numerous studies have reported that genes with similar expression patterns are co-regulated. Working from gene expression data, we assumed that genes with similar expression patterns share similar transcription factor binding sites (TFBSs). TFBSs serve as the binding regions for transcription factors (TFs) and thereby regulate gene expression. Various analysis tools have been developed in this context. However, they fall short in two areas: the combined analysis of expression patterns and significant TFBSs, and the functional analysis of the target genes of significantly overrepresented putative regulators. In this study, we present FCAnalyzer, a web-based Functional Clustering Analysis Tool for Predicted Transcription Regulatory Elements and Gene Ontology Terms. The system integrates clusters of genes that have similar microarray expression patterns with the TFBS data for each cluster. FCAnalyzer is designed to perform two independent clustering procedures. The first clusters gene expression profiles using the K-means clustering method. The second clusters the predicted TFBSs in the upstream regions of the previously clustered genes using a hierarchical biclustering method for simultaneous grouping of genes and samples. The system retrieves information on the predicted TFBSs in each cluster using Match™ from the TRANSFAC database. We used Gene Ontology (GO) term analysis for functional annotation of genes in the same cluster. We also provide the user with a combinatorial analysis of TFBS pairs. Enrichment in the TFBS and GO term analyses is assessed statistically by calculating P values based on Fisher's exact test and the hypergeometric distribution, with Bonferroni correction. FCAnalyzer is a web-based, user-friendly functional clustering analysis system that facilitates the transcriptional regulatory analysis of co-expressed genes. It presents analyses of clustered genes, significant TFBSs, significantly enriched TFBS combinations, their target genes, and TFBS-TF pairs.
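The enrichment statistics named in the abstract can be illustrated with a minimal sketch. It is not FCAnalyzer's implementation: the function names, argument names, and example counts are hypothetical. It relies on the fact that a one-sided Fisher's exact test for overrepresentation is the upper tail of the hypergeometric distribution, and it applies a Bonferroni correction by multiplying each P value by the number of tests.

```python
from math import comb


def hypergeom_enrichment_p(cluster_hits, cluster_size, genome_hits, genome_size):
    """Upper-tail hypergeometric P value, P(X >= cluster_hits).

    cluster_hits: genes in the cluster carrying the TFBS (or GO term)
    cluster_size: genes in the cluster
    genome_hits:  genes in the background set carrying the TFBS (or GO term)
    genome_size:  genes in the background set
    (All four names are illustrative, not taken from FCAnalyzer.)
    """
    max_hits = min(cluster_size, genome_hits)
    total = comb(genome_size, cluster_size)
    tail = sum(
        comb(genome_hits, k) * comb(genome_size - genome_hits, cluster_size - k)
        for k in range(cluster_hits, max_hits + 1)
    )
    return tail / total


def bonferroni(p_values):
    """Bonferroni-adjust a dict of P values: multiply by the test count, cap at 1."""
    n_tests = len(p_values)
    return {name: min(1.0, p * n_tests) for name, p in p_values.items()}


# Made-up counts: three hypothetical TFBS labels tested in one 40-gene cluster
# against a 2000-gene background.
raw = {
    "TFBS_A": hypergeom_enrichment_p(12, 40, 100, 2000),
    "TFBS_B": hypergeom_enrichment_p(3, 40, 150, 2000),
    "TFBS_C": hypergeom_enrichment_p(1, 40, 20, 2000),
}
adjusted = bonferroni(raw)
```

A TFBS or GO term would then be reported as enriched in the cluster when its adjusted P value falls below a chosen threshold such as 0.05; the threshold here is an assumption, not a value stated in the abstract.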
Keywords
Transcription regulatory element; clustering analysis