http://dx.doi.org/10.9723/jksiis.2010.15.1.037

Suffix Tree Constructing Algorithm for Large DNA Sequences Analysis  

Choi, Hae-Won (Department of Computer Engineering, Kyungwoon University)
Publication Information
Journal of Korea Society of Industrial Information Systems / v.15, no.1, 2010, pp. 37-46
Abstract
A suffix tree is a data structure that exposes the internal structure of a string and supports efficient solutions to a wide range of complex string problems, particularly in computational biology. However, as the volume of biological data grows rapidly, it becomes infeasible to construct suffix trees for such data in main memory, so an efficient technique is needed for constructing them in secondary storage. In this paper, we present a method for constructing a suffix tree on disk for a large set of DNA strings using a new index scheme. We also show a typical application of the disk-based suffix tree.
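As a rough illustration of why suffix-based indexes suit string matching on DNA, the following minimal sketch builds a naive, uncompressed suffix trie in main memory and looks up every occurrence of a pattern. It is not the disk-based construction or the index scheme proposed in the paper. The names `Node`, `build_suffix_trie`, and `find_occurrences` are hypothetical, and the quadratic construction is only practical for short strings.

```python
from typing import Dict, List


class Node:
    """One node of a naive (uncompressed) suffix trie."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        # Start offsets of every suffix whose path passes through this node.
        self.starts: List[int] = []


def build_suffix_trie(text: str) -> Node:
    """Insert every suffix of `text`. O(n^2) time and space, illustration only."""
    root = Node()
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.children.setdefault(ch, Node())
            node.starts.append(start)
    return root


def find_occurrences(root: Node, pattern: str) -> List[int]:
    """Return sorted start offsets of `pattern`; an empty pattern yields []."""
    node = root
    for ch in pattern:
        if ch not in node.children:
            return []
        node = node.children[ch]
    return sorted(node.starts)


trie = build_suffix_trie("GATTACAGATTA")
print(find_occurrences(trie, "ATTA"))  # [1, 8]
print(find_occurrences(trie, "TTG"))   # []
```

Once the index is built, a lookup takes time proportional to the pattern length plus the number of matches, independent of the text length. A real suffix tree compresses single-child paths to keep space linear, and that linear structure is what a disk-based method would have to lay out in secondary storage.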
Keywords
Suffix Tree; String Matching; DNA Processing; Data Analysis; KMP;
Citations & Related Records
  • Reference
1 Dan Gusfield, Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Cambridge University Press, 2002.
2 Juha Karkkainen and Esko Ukkonen, "Sparse Suffix Tree," COCOON, pp.219-233, 1996.
3 Mark Nelson, "Fast String Searching With Suffix Trees," Dr. Dobb's Journal, 1996.
4 Graham A. Stephen, String Search, University College of North Wales, 2003.
5 Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest, Introduction to Algorithms, The MIT Press, 2005.
6 Schwartz, S., Zhang, Z., Frazer, K.A., Smit, A., Riemer, C., Bouck, J., Gibbs, R., Hardison, R. and Miller, W., "A web server for aligning two genomic DNA sequences," Genome Res., vol. 10, pp.577-586, 2004.
7 Batzoglou, S., Pachter, L., Mesirov, J.P., Berger, B. and Lander, E.S., "Human and mouse gene structure: comparative analysis and application to exon prediction," Genome Res., vol. 10, pp.950-958, 2005.
8 E. Ukkonen, "On-Line Construction of Suffix Trees," Algorithmica, vol. 14, pp.249-260, 1995.
9 Hardison, R.C., Oeltjen, J. and Miller, W., "Long human-mouse sequence alignments reveal novel regulatory elements: a reason to sequence the mouse genome," Genome Res., vol. 7, pp.959-966, 1998.
10 Shimuzu, N., Roe, B.A., Chissoe, S. et al., "The DNA sequence of human chromosome 22," Nature, vol. 402, pp.489-495, 1999.