An Efficient Local Alignment Algorithm for DNA Sequences including N and X  

Kim, Jin-Wook (School of Computer and Information Engineering, Inha University)
Abstract
A local alignment algorithm finds a pair of substrings, one from each of two given strings, that are similar to each other. A DNA sequence may contain not only A, C, G, and T but also N and X, which are used when the original bases have lost their information for various reasons. In this paper, we present an efficient local alignment algorithm for two DNA sequences containing N and X under the affine gap penalty metric. Our algorithm extends the Kim-Park algorithm and can be further extended to handle other characters whose properties are similar to those of N and X.
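For orientation, the problem setting can be illustrated with a minimal sketch of the classical quadratic-time local alignment recurrence with affine gaps (in the style of Smith-Waterman and Gotoh). This is not the efficient algorithm of the paper or the Kim-Park algorithm. The scoring values, including how N and X are scored, and the names `pair_score` and `local_align_affine` are assumptions chosen only for illustration.

```python
NEG_INF = float("-inf")


def pair_score(a, b, match=1, mismatch=-1, n_score=0, x_score=-1):
    """Score one aligned pair of characters.

    Assumed scheme (not taken from the paper): any pair involving X
    scores x_score, any other pair involving N scores n_score.
    """
    if a == "X" or b == "X":
        return x_score
    if a == "N" or b == "N":
        return n_score
    return match if a == b else mismatch


def local_align_affine(s, t, gap_open=2, gap_extend=1):
    """Return (best_score, i, j): the best local alignment score and the
    1-based end positions in s and t. A gap of length k costs
    gap_open + k * gap_extend. Runs in O(len(s) * len(t)) time."""
    n = len(t)
    h_prev = [0] * (n + 1)        # H: best score of an alignment ending at (i-1, j)
    f_prev = [NEG_INF] * (n + 1)  # F: best score ending with a gap in t
    best = (0, 0, 0)
    for i in range(1, len(s) + 1):
        h_cur = [0] * (n + 1)
        f_cur = [NEG_INF] * (n + 1)
        e = NEG_INF               # E: best score ending with a gap in s
        for j in range(1, n + 1):
            e = max(e - gap_extend, h_cur[j - 1] - gap_open - gap_extend)
            f_cur[j] = max(f_prev[j] - gap_extend,
                           h_prev[j] - gap_open - gap_extend)
            diag = h_prev[j - 1] + pair_score(s[i - 1], t[j - 1])
            h_cur[j] = max(0, diag, e, f_cur[j])
            if h_cur[j] > best[0]:
                best = (h_cur[j], i, j)
        h_prev, f_prev = h_cur, f_cur
    return best
```

Under these assumed scores, `local_align_affine("ACNT", "ACGT")` aligns the N against G at no cost, whereas replacing the N with X breaks the alignment into shorter high-scoring pieces.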
Keywords
DNA sequences; local alignment; affine gap penalty
References
1 S. Batzoglou, D.B. Jaffe, K. Stanley, J. Butler, et al., ARACHNE: A Whole-Genome Shotgun Assembler, Genome Research, 12, pp.177-189, 2002.
2 J. Wang, G.K. Wong, P. Ni, et al., RePS: A Sequence Assembler that Masks Exact Repeats Identified from the Shotgun Data, Genome Research, 12, pp.824-831, 2002.
3 T.F. Smith, M.S. Waterman, Identification of Common Molecular Subsequences, Journal of Molecular Biology, 147, pp.195-197, 1981.
4 O. Gotoh, An Improved Algorithm for Matching Biological Sequences, Journal of Molecular Biology, 162, pp.705-708, 1982.
5 J.W. Kim, K. Park, An Efficient Alignment Algorithm for Masked Sequences, Theoretical Computer Science, 370, pp.19-33, 2007.
6 D. Gusfield, Algorithms on Strings, Trees, and Sequences, Cambridge University Press, New York, 1997.
7 E.W. Myers, G.G. Sutton, A.L. Delcher, I.M. Dew, D.P. Fasulo, et al., A Whole-Genome Assembly of Drosophila, Science, 287, pp.2196-2204, 2000.
8 J.W. Kim, K. Roh, K. Park, H. Park, J. Seo, MLP: Mate-Based Sequence Layout with PHRAP, Bioinformatics and Biosystems, 1(1), pp.61-66, 2006.
9 NC-IUB, Nomenclature for Incompletely Specified Bases in Nucleic Acid Sequences. Recommendations 1984, European Journal of Biochemistry, 150, pp.1-5, 1985.
10 J.W. Kim, A. Amir, G.M. Landau, K. Park, Computing Similarity of Run-Length Encoded Strings with Affine Gap Penalty, Theoretical Computer Science, 395, pp.268-282, 2008.
11 P. Green, PHRAP, http://www.phrap.org.