An Efficient Method for Multiple Sequence Alignment using Subalignment Refinement

An Algorithm for Efficient Multiple Sequence Alignment Using a Subalignment Refinement Technique

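  • 김진 (Division of Information and Communication Engineering, Hallym University);
  • 정우철 (Department of Computer Engineering, Hallym University);
  • 엄상용 (Division of General Education, Hallym University)
  • Published: 2003.10.01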

Abstract

Multiple sequence alignment is a useful tool for identifying relationships among protein sequences. Dynamic programming is the most widely used algorithm for obtaining a multiple sequence alignment with optimal cost. However, dynamic programming cannot be applied with certain cost functions, and in those cases it cannot produce an optimal multiple sequence alignment. We propose a subalignment refinement algorithm to overcome this limitation of dynamic programming, and we show that the proposed algorithm solves the problem efficiently.

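Multiple sequence alignment of proteins is a useful tool for inferring relationships among protein sequences. The most useful method for obtaining an optimal multiple sequence alignment is dynamic programming. However, because dynamic programming cannot use certain cost functions, in special cases it fails to provide the multiple sequence alignment with minimum cost. To solve this problem, we proposed an algorithm that uses a subalignment refinement technique, and we showed that this algorithm effectively resolves the problem of dynamic programming.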
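
As background for the dynamic programming approach mentioned above, the following is a minimal sketch of the textbook pairwise global alignment recurrence, not the subalignment refinement algorithm proposed in the paper. The function name alignment_cost and the mismatch and gap costs are assumptions chosen only for illustration. The sketch works because the cost is a sum of independent per-column terms.

```python
def alignment_cost(a, b, mismatch=1, gap=2):
    """Minimum cost of globally aligning strings a and b.

    Illustrative only: the mismatch and gap costs are assumed values,
    and the cost is additive over alignment columns.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # cost[i][j] holds the minimum cost of aligning a[:i] with b[:j].
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * gap  # a[:i] aligned entirely against gaps
    for j in range(1, cols):
        cost[0][j] = j * gap  # b[:j] aligned entirely against gaps
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = 0 if a[i - 1] == b[j - 1] else mismatch
            cost[i][j] = min(
                cost[i - 1][j - 1] + substitution,  # pair a[i-1] with b[j-1]
                cost[i - 1][j] + gap,               # a[i-1] against a gap
                cost[i][j - 1] + gap,               # b[j-1] against a gap
            )
    return cost[-1][-1]


if __name__ == "__main__":
    print(alignment_cost("GATTACA", "GATTACA"))
```

Each table entry depends only on its three neighbouring entries, which is what lets the recurrence reach the optimum for this kind of additive cost.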
