Multiple Sequence Alignment Genetic Algorithm

Multiple Sequence Alignment Using an Evolutionary Algorithm

  • Kim, Jin (Department of Computer Science, College of Natural Sciences, Konkuk University)
  • Song, Min-Dong (Department of Molecular Biology, Konkuk University)
  • Choi, Hong-Sik (Division of Computer Engineering, Hallym University)
  • Chang, Yeon-Ah (Division of Computer Engineering, Hallym University)
  • Published : 1999.06.01

Abstract

Multiple sequence alignment of DNA and protein sequences is an important tool in the study of molecular evolution, gene regulation, and protein structure-function relationships. Progressive pairwise alignment methods generate multiple sequence alignments quickly, but not necessarily with optimal cost. Dynamic programming generates multiple sequence alignments with optimal cost in most cases, but requires long execution times. In this paper, we propose a genetic algorithm to improve the multiple sequence alignments generated by current methods, describe the design of the genetic algorithm, and compare the multiple sequence alignments from our method with those from current methods.

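Multiple sequence alignment, which aligns three or more DNA or protein sequences, is an essential tool for studying the evolutionary relationships among sequences, gene regulation, and protein structure and function. Existing methods for obtaining a multiple sequence alignment include those, such as progressive pairwise alignment, that produce a satisfactory alignment within a short execution time, and those, such as dynamic programming, that produce an optimal alignment but take a relatively long time to run. In this paper, we present a method that uses an evolutionary algorithm to obtain, in a short time, an improved version of the multiple sequence alignment produced by existing methods. We describe the components of the evolutionary algorithm and demonstrate the advantages of the method using real sequences.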
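The abstract does not specify the paper's fitness function or genetic operators, so the following is only a minimal sketch of the general idea: start from an alignment produced by an existing method and let a genetic algorithm try to improve it. The sum-of-pairs scoring values, the gap-moving mutation, the row-wise crossover, and all function names are assumptions made for illustration, not the authors' design.

```python
import random
from itertools import combinations

# Assumed scoring values for illustration; the paper's cost function is not given.
MATCH, MISMATCH, GAP = 1, -1, -2

def sp_score(aln):
    """Sum-of-pairs score of an alignment (list of equal-length gapped strings)."""
    total = 0
    for col in zip(*aln):
        for a, b in combinations(col, 2):
            if a == "-" and b == "-":
                continue  # gap-gap pairs contribute nothing
            if a == "-" or b == "-":
                total += GAP
            else:
                total += MATCH if a == b else MISMATCH
    return total

def mutate(aln, rng):
    """Move one gap in a random row to a new position; residue order is preserved."""
    rows = [list(r) for r in aln]
    row = rng.choice(rows)
    gaps = [i for i, c in enumerate(row) if c == "-"]
    if gaps:
        row.pop(rng.choice(gaps))
        row.insert(rng.randrange(len(row) + 1), "-")
    return ["".join(r) for r in rows]

def crossover(p1, p2, rng):
    """Row-wise crossover: each row is taken from one parent (row lengths never change)."""
    return [rng.choice(pair) for pair in zip(p1, p2)]

def refine(seed_aln, generations=200, pop_size=30, rng=None):
    """Refine a seed alignment; the best individual always survives (elitism)."""
    rng = rng or random.Random(0)
    pop = [list(seed_aln)] + [mutate(seed_aln, rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=sp_score, reverse=True)
        elite = pop[: pop_size // 2]  # truncation selection
        children = [
            mutate(crossover(*rng.sample(elite, 2), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=sp_score)

if __name__ == "__main__":
    # Hypothetical seed, standing in for the output of a progressive aligner.
    seed = ["ACGT-", "-ACGT", "ACGT-"]
    best = refine(seed)
    print(sp_score(seed), seed)
    print(sp_score(best), best)
```

Because the seed alignment is kept in the initial population and the top half survives every generation, the returned alignment never scores worse than the seed under the assumed scoring. That mirrors the abstract's goal of improving on alignments from existing methods.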
