
Computational Detection of Prokaryotic Core Promoters in Genomic Sequences  

Kim Ki-Bong (Department of Bioinformatics Engineering, Sangmyung University)
Sim Jeong Seop (Department of Computer Science and Engineering, Inha University)
Publication Information
Journal of Microbiology / v.43, no.5, 2005, pp. 411-416
Abstract
High-throughput sequencing of microbial genomes has led to the rapid accumulation of an enormous amount of genomic sequence data. Against this background, the computational detection of promoters in genomic DNA sequences has attracted considerable research attention in recent years. This paper describes the development of a predictive model, the dependence decomposition weight matrix model (DDWMM), designed to detect the core promoter region, including the -10 region and the transcription start sites (TSSs), in prokaryotic genomic DNA sequences, a task of some importance for genome annotation. The model captures the most significant dependencies between positions (non-adjacent as well as adjacent) via the maximal dependence decomposition (MDD) procedure, which iteratively decomposes a data set into subsets on the basis of significant dependence between positions in the promoter region being modeled. Such dependencies may be closely related to biological and structural considerations, since promoter elements occur in a variety of combinations and are separated by varying distances. The DDWMM may therefore be well suited to detecting core promoter regions and TSSs in long microbial genomic contigs. To demonstrate the effectiveness of the model, we performed 10-fold cross-validation experiments on 607 experimentally verified promoter sequences; the model showed good performance in terms of sensitivity.
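The decomposition step of the MDD procedure can be illustrated with a short sketch. It follows the general scheme introduced for splice-site modeling (Burge and Karlin, 1997) and is not taken from the DDWMM implementation, whose details the abstract does not give. For each position i, a chi-square statistic is computed between "matches the consensus at i" and the base at every other position j; the statistics are summed, and the data set is split at the position with the largest sum. The function names (`consensus`, `chi_square`, `best_split_position`, `split_on`) and the toy sequences are assumptions made for illustration only.

```python
from collections import Counter

ALPHABET = "ACGT"

def consensus(seqs, i):
    """Most frequent base in column i (ties broken by ALPHABET order)."""
    counts = Counter(s[i] for s in seqs)
    return max(ALPHABET, key=lambda b: counts[b])

def chi_square(seqs, i, j):
    """Chi-square statistic of the 2x4 table:
    (consensus match at column i?) versus (base at column j)."""
    cons = consensus(seqs, i)
    table = {(m, b): 0 for m in (True, False) for b in ALPHABET}
    for s in seqs:
        table[(s[i] == cons, s[j])] += 1
    n = len(seqs)
    row = {m: sum(table[(m, b)] for b in ALPHABET) for m in (True, False)}
    col = {b: table[(True, b)] + table[(False, b)] for b in ALPHABET}
    stat = 0.0
    for m in (True, False):
        for b in ALPHABET:
            expected = row[m] * col[b] / n
            if expected > 0:  # skip empty cells to avoid division by zero
                stat += (table[(m, b)] - expected) ** 2 / expected
    return stat

def best_split_position(seqs):
    """Return (position with the largest summed chi-square, all sums)."""
    length = len(seqs[0])
    scores = [
        sum(chi_square(seqs, i, j) for j in range(length) if j != i)
        for i in range(length)
    ]
    best = max(range(length), key=lambda i: scores[i])
    return best, scores

def split_on(seqs, i):
    """Partition sequences by whether they match the consensus at column i."""
    cons = consensus(seqs, i)
    matching = [s for s in seqs if s[i] == cons]
    others = [s for s in seqs if s[i] != cons]
    return matching, others

# Hypothetical aligned toy data: columns 0 and 2 co-vary, column 1 is constant.
toy = ["AAT"] * 6 + ["CAG"] * 4
best, scores = best_split_position(toy)
matching, others = split_on(toy, best)
```

In a full MDD model this split would be applied recursively to each subset until no significant dependence remains or the subsets become too small, with a separate weight matrix estimated for each final subset. The stopping thresholds are left out here because the abstract does not state them.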
Keywords
contig; core promoter; DDWMM; MDD procedure; TSS; 10-fold cross-validation