A Study on the Application Methods of a Support Vector Machine for Gene Promoter Prediction

  • Kim, Ki-Bong (Department of Bioinformatics Engineering, College of Engineering, Sangmyung University)
  • Published: 2007.05.25

Abstract

High-throughput sequencing of many genomes has led to the rapid accumulation of an enormous amount of genomic sequence data. In this context, the computational detection of promoters in genomic DNA sequences has attracted considerable attention in recent years, since accurate promoter prediction can provide clues for elucidating overall genetic networks. This study explores applications of the support vector machine (SVM) to promoter prediction in order to identify sound approaches for discriminating between promoter and non-promoter regions. The results of various experiments show that the encoding method, the encoded region, and the composition of the training data can each play an important role in SVM performance.

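Research on gene structure prediction and gene expression mechanisms has become a very important issue. In particular, the prediction of promoter regions, which play an important role in controlling gene expression, has been studied extensively because it provides a starting point for elucidating the overall network of an organism. This paper presents a study of how a Support Vector Machine (SVM) can be applied to the prediction of eukaryotic gene promoters. Through various experiments on the encoding methods used to generate feature vector values and on the composition of the training data, it suggests a sound direction for applying SVMs.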
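The abstract names the encoding method and the encoding region as factors in SVM performance but does not specify which encodings were tested. As a minimal sketch under that caveat, the code below shows two commonly used ways to turn a DNA window into a feature vector: position-wise one-hot encoding and k-mer frequency encoding. The function names, the choice of `k`, and the window parameters `upstream` and `downstream` are all hypothetical illustrations, not the paper's settings.

```python
from itertools import product

BASES = "ACGT"


def one_hot_encode(seq):
    """Position-wise encoding: 4 binary features per base.

    Bases other than A, C, G, T (e.g. N) become four zeros.
    """
    vec = []
    for base in seq.upper():
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec


def kmer_frequency_encode(seq, k=3):
    """Composition encoding: relative frequency of each of the 4**k k-mers."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases
            counts[word] += 1
            windows += 1
    return [counts[w] / windows if windows else 0.0 for w in kmers]


def encode_region(seq, tss_index, upstream, downstream, encoder):
    """Cut the window [tss_index - upstream, tss_index + downstream) and encode it.

    tss_index is an assumed 0-based position of the transcription start site.
    """
    start = max(0, tss_index - upstream)
    return encoder(seq[start:tss_index + downstream])


if __name__ == "__main__":
    example = "GGTATAAAGG"  # toy sequence, not real data
    print(len(encode_region(example, 5, 3, 2, one_hot_encode)))
    print(len(kmer_frequency_encode(example, k=2)))
```

Vectors produced this way, paired with labels such as +1 for promoter and -1 for non-promoter windows, could then be passed to any SVM implementation. Changing `upstream`, `downstream`, or the encoder changes the feature space, which is one way to see why the encoding method and region can affect classification performance.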
