
Optimal Sequence Alignment Algorithm Using Space Division Technique  

Ahn, Heui-Kook (Department of Computer Science, Kangwon National University)
Roh, Hi-Young (Department of Computer Science, Kangwon National University)
Abstract
The problem of finding an optimal alignment between two sequences A and B can be solved efficiently by a dynamic programming algorithm (DPA). For long sequences, however, the problem may become intractable, because the DPA requires $O(mn)$ time and space, where $m = |A|$ and $n = |B|$. To address the space requirement, Hirschberg developed a linear-space, quadratic-time algorithm (the linear space algorithm, LSA), so that memory is no longer the limiting factor for long sequences. As processors become faster and memories larger, there is a need for a method that speeds up processing, even at the cost of more space. For this purpose, we present an algorithm that solves the problem in quadratic time and linear space. Using a space division technique, it computes an optimal alignment faster than the LSA, although it requires more memory. We generalize the algorithm to handle the case in which the space cannot be divided into equal integer-sized parts, and we prune the additional space using the concept of entry and exit nodes. Through proof and experiment, we show that our algorithm uses $d(m+n)$ space and runs in slightly more than $mn$ time, which is faster than the LSA.
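For orientation, the linear-space baseline that the abstract compares against can be sketched as follows. This is a minimal Python sketch of the classical Hirschberg-style divide-and-conquer alignment, not the space division algorithm proposed in the paper. The function names (`nw_last_row`, `hirschberg`, `alignment_score`) and the scoring values (match 1, mismatch -1, linear gap -1) are assumptions chosen for illustration.

```python
def nw_last_row(a, b, match=1, mismatch=-1, gap=-1):
    """Last row of the global-alignment DP table, kept in O(len(b)) space."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev


def hirschberg(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment as two gapped strings."""
    if len(a) == 0:
        return "-" * len(b), b
    if len(b) == 0:
        return a, "-" * len(a)
    if len(a) == 1:
        # Base case: pair the single character with its best column in b,
        # or leave it unpaired if two gaps score better than that pairing.
        j = max(range(len(b)), key=lambda k: match if b[k] == a else mismatch)
        sub = match if b[j] == a else mismatch
        if sub >= 2 * gap:
            return "-" * j + a + "-" * (len(b) - j - 1), b
        return a + "-" * len(b), "-" + b
    mid = len(a) // 2
    # Forward scores for the top half, backward scores for the bottom half.
    left = nw_last_row(a[:mid], b, match, mismatch, gap)
    right = nw_last_row(a[mid:][::-1], b[::-1], match, mismatch, gap)
    # Column where an optimal path crosses the middle row.
    split = max(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])
    a1, b1 = hirschberg(a[:mid], b[:split], match, mismatch, gap)
    a2, b2 = hirschberg(a[mid:], b[split:], match, mismatch, gap)
    return a1 + a2, b1 + b2


def alignment_score(x, y, match=1, mismatch=-1, gap=-1):
    """Score two equal-length gapped strings column by column."""
    return sum(
        gap if "-" in (p, q) else (match if p == q else mismatch)
        for p, q in zip(x, y)
    )


if __name__ == "__main__":
    top, bottom = hirschberg("AGTACGCA", "TATGC")
    print(top)
    print(bottom)
    print(alignment_score(top, bottom))
```

Each level of the recursion recomputes score rows instead of storing the full table, which is why this baseline does roughly twice the work of the plain DPA. That extra work is the cost the abstract proposes to reduce by spending more memory.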
Keywords
Sequence Optimal Alignment; Dynamic Programming Algorithm; Linear Space Algorithm; Division Generalization;
References
1 Dennis A. Benson, Ilene Karsch-Mizrachi, David J. Lipman, James Ostell and David L. Wheeler. 'GenBank,' Nucleic Acids Research, Vol. 24, Database issue, 2006
2 Andreas D. Baxevanis and B. F. Francis Ouellette. 'Bioinformatics: A practical guide to the analysis of genes and proteins,' 2nd ed., Wiley & Sons, 2001
3 Tao Jiang, Ying Xu and Michael Q. Zhang. 'Current topics in computational molecular biology,' The MIT Press, 5, Cambridge Center, Cambridge, 2002
4 Steven L. Salzberg. 'Gene Discovery in DNA Sequences,' IEEE Intelligent Systems, Nov. 1999
5 Wagner, R. A. and Fischer, M. J. 'The string-to-string correction problem,' J. ACM 21, 168-173, Jan. 1974
6 Setubal, J. and Meidanis, J. 'Introduction to Computational Molecular Biology,' PWS Publishing Company, 1997
7 Arthur M. Lesk. 'Introduction to Bioinformatics,' Oxford University Press Inc, 2002
8 Hirschberg, D. S. 'A linear space algorithm for computing maximal common subsequences,' Commun. Assoc. Comput. Mach. vol. 18: 341-343, 1975
9 Myers, E. W. and Miller, W. 'Optimal alignments in linear space,' Comput. Applic. Biosci. vol. 4: 11-17, 1988
10 F. S. Roberts. 'Applied Combinatorics,' Prentice-Hall, 1984
11 Delcher, A. L., Kasif, S., Fleischmann, R. D., Peterson, J., White, O. and Salzberg, S. L. 'Alignment of whole genomes,' Nucleic Acids Research 27, 2369-2376, 1999
12 Brudno, M. and Morgenstern, B. 'Fast and sensitive alignment of large genomic sequences,' in 'IEEE Computer Society Bioinformatics Conference 2002', Stanford University, CA, USA, pp. 138-147, 2002
13 Ning, Z., et al. 'SSAHA: a fast search method for large DNA databases,' Genome Research 11(10), 1725-1729, 2001
14 Needleman, S. B. and Wunsch, C. D. 'A general method applicable to the search for similarities in the amino acid sequences of two proteins,' J. Mol. Biol. vol. 48: 443-453, 1970
15 Cohen, J. 'Bioinformatics: An introduction for computer scientists,' ACM Comput. Surv. Vol. 36, No. 2, 122-158, Jun. 2004
16 Schwartz, S., Zhang, Z., Frazer, K. A., et al. 'PipMaker: A Web Server for Aligning Two Genomic DNA Sequences,' Genome Research 10, 577-686, 2002