Implementation of a Protein Motif Prediction System Using Integrated Motif Resources

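  • Bum Ju Lee (이범주; Department of Computer Science, Graduate School, Chungbuk National University) ;
  • Eun Sun Choi (최은선; Department of Computer Science, Graduate School, Chungbuk National University) ;
  • Keun Ho Ryu (류근호; School of Electrical and Computer Engineering, Chungbuk National University)
  • Published: 2003.08.01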

Abstract

Motif databases are used to predict the function and structure of proteins from the raw data produced by genome sequencing, and their use is increasing along with the explosive growth of that raw data. However, these databases were developed and extended independently and have been integrated only logically, through web-based cross-references. As a result, they suffer from heterogeneous search results, complex query processing, and difficulty in handling duplicate database entries. To address these problems, this paper proposes the physical integration of motif resources and describes an integrated search method over family-based protein prediction methods. Finally, we evaluate the results obtained by building the integrated motif database and implementing the protein motif prediction system.
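The idea of physical integration can be illustrated with a minimal sketch. It is not the system described in the paper: the `MotifEntry` record, the use of a family name as the integration key, the regular-expression patterns, and the source names and accessions (`DB_A`, `A0001`, and so on) are all hypothetical and chosen only for illustration. The sketch loads entries from several sources into one table, drops duplicate patterns within a family, and answers a single query with uniformly shaped hits.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MotifEntry:
    source: str      # originating database (hypothetical names below)
    accession: str   # accession within that database (hypothetical)
    family: str      # family name, assumed here to be the integration key
    pattern: str     # motif written as a regular expression (assumption)


def integrate(entries):
    """Merge entries into one table keyed by family, skipping duplicate patterns."""
    table = {}
    for entry in entries:
        bucket = table.setdefault(entry.family, [])
        if all(existing.pattern != entry.pattern for existing in bucket):
            bucket.append(entry)
    return table


def predict(sequence, table):
    """Scan one sequence against every integrated motif; return uniform hit tuples."""
    hits = []
    for family, bucket in table.items():
        for entry in bucket:
            for match in re.finditer(entry.pattern, sequence):
                hits.append((family, entry.source, entry.accession,
                             match.start(), match.end()))
    return hits


# Hypothetical entries: the second duplicates the first under another source.
entries = [
    MotifEntry("DB_A", "A0001", "family_1", r"C.{2}C"),
    MotifEntry("DB_B", "B0001", "family_1", r"C.{2}C"),
    MotifEntry("DB_A", "A0002", "family_2", r"G.GK[ST]"),
]
table = integrate(entries)
hits = predict("MACAACGAGKSTT", table)
```

Because all entries live in one table, a single query returns results in one format, and the duplicate entry is resolved once at load time rather than at each search.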
