A New Merging Algorithm for Constructing Suffix Trees for Integer Alphabets

Kim, Dong-Kyu; Sim, Jeong-Seop; Park, Kun-Soo

Journal of KIISE: Computer Systems and Theory

Volume 29, Issue 2 / Pages 87-93 / 2002 / 1229-683X (pISSN)

Korean Institute of Information Scientists and Engineers

Kim, Dong-Kyu (Dept. of Electronics, Electrical, Information and Computer Engineering, Busan National University);
Sim, Jeong-Seop (School of Computer Science and Engineering, Seoul National University);
Park, Kun-Soo (School of Computer Science and Engineering, Seoul National University)

Published: 2002.02.01

Abstract

A recent approach to constructing the suffix tree $T_S$ of a given string $S$ is to recursively construct the suffix tree $T_o$ for the odd positions, construct the suffix tree $T_e$ for the even positions from $T_o$, and then merge $T_o$ and $T_e$ into $T_S$. Constructing suffix trees for integer alphabets in linear time had long been a major open problem on index data structures. Farach used this approach to give the first linear-time algorithm for integer alphabets. The hardest part of Farach's algorithm is the merging step. In this paper we present a new and simpler merging algorithm based on a coupled BFS (breadth-first search). Our merging algorithm is more intuitive than Farach's coupled DFS (depth-first search) merging, and thus it can be easily extended to other applications.
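The three phases named in the abstract (odd positions, even positions derived from the odd ones, then a merge) can be illustrated on sorted suffix lists instead of suffix trees. The Python sketch below is only an illustration under loud assumptions: the function name `odd_even_suffix_order` is hypothetical, the recursive construction of $T_o$ is replaced by a direct sort, and the merge compares suffixes naively, so it is neither linear-time nor the coupled-BFS merging of the paper. What it does show is why the even part follows from the odd part: the suffix at an even position $i$ is the character $S[i]$ followed by the odd suffix at position $i+1$.

```python
def odd_even_suffix_order(s):
    """Return the 1-indexed start positions of the suffixes of s in sorted order.

    s may be a string or a list of integers (an integer alphabet).
    """
    n = len(s)

    def suffix(i):
        # Suffix of s starting at 1-indexed position i.
        return s[i - 1:]

    # Phase 1 (stand-in for the recursive construction of T_o):
    # sort the suffixes that start at odd positions directly.
    odd = sorted(range(1, n + 1, 2), key=suffix)
    rank = {pos: r for r, pos in enumerate(odd, start=1)}
    rank[n + 1] = 0  # the empty suffix past the end sorts before everything

    # Phase 2 (T_e from T_o): an even suffix is one character followed by
    # an odd suffix, so the pair (character, odd rank) determines its order.
    even = sorted(range(2, n + 1, 2), key=lambda i: (s[i - 1], rank[i + 1]))

    # Phase 3 (merge): naive comparison-based merge of the two sorted lists.
    # ASSUMPTION: this replaces the paper's merging step for illustration.
    merged, a, b = [], 0, 0
    while a < len(odd) and b < len(even):
        if suffix(odd[a]) < suffix(even[b]):
            merged.append(odd[a])
            a += 1
        else:
            merged.append(even[b])
            b += 1
    merged.extend(odd[a:])
    merged.extend(even[b:])
    return merged
```

For example, `odd_even_suffix_order("banana")` returns `[6, 4, 2, 1, 5, 3]`, the positions of the suffixes `a`, `ana`, `anana`, `banana`, `na`, `nana`.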

References

E.M. McCreight, A space-economical suffix tree construction algorithm, J. ACM 23 (1976), 262-272 https://doi.org/10.1145/321941.321946
P. Weiner, Linear pattern matching algorithms, Proc. 14th IEEE Symp. Switching and Automata Theory (1973), 1-11
M.T. Chen and J. Seiferas, Efficient and elegant subword tree construction, In A. Apostolico and Z.Galil, editors, Combinatorial Algorithms on Words, NATO ASI Series F: Computer and System Sciences (1985)
E. Ukkonen, On-line construction of suffix trees, Algorithmica 14 (1995), 249-260 https://doi.org/10.1007/BF01206331
Z. Galil, Open problems in stringology, In A. Apostolico and Z. Galil, editors, Combinatorial Algorithms on Words, NATO ASI Series F: Computer and System Sciences(1985)
S. Kosaraju and A. Delcher, Large-scale assembly of DNA strings and space-efficient construction of suffix trees, ACM Symp. Theory of Computing (1995), 169-177 https://doi.org/10.1145/225058.225108
S. Kosaraju and A. Delcher, Large-scale assembly of DNA strings and space-efficient construction of suffix trees (corrections), ACM Symp. Theory of Computing (1996) https://doi.org/10.1145/237814.250975
M. Farach and S. Muthukrishnan, Optimal logarithmic time randomized suffix tree construction, Int. Colloq. Automata Languages and Programming (1996), 550-561
M. Farach, Optimal suffix tree construction with large alphabets, IEEE Symp. Found. Computer Science (1997), 137-143 https://doi.org/10.1109/SFCS.1997.646102
R. Hariharan, Optimal parallel suffix tree construction, IEEE Symp. Found. Computer Science (1994), 290-299 https://doi.org/10.1145/195058.195162
S.C. Sahinalp and U. Vishkin, Symmetry breaking for suffix tree construction, IEEE Symp Found. Computer Science. (1994), 300-309 https://doi.org/10.1145/195058.195164
R.M. Karp and M.O. Rabin, Efficient randomized pattern-matching algorithms, IBM Journal of Research and Development 31 (1987), 249-260 https://doi.org/10.1147/rd.312.0249
D. Harel and R.E. Tarjan, Fast algorithms for finding nearest common ancestors, SIAM J. Comput. 13 (1984), 338-355 https://doi.org/10.1137/0213024
B. Schieber and U. Vishkin, On finding lowest common ancestors: simplification and parallelization, SIAM J. Comput. 17, (1988), 1253-1262 https://doi.org/10.1137/0217079
D.K. Kim and K. Park, Linear-time construction of two-dimensional suffix trees, Int. Colloq. Automata Languages and Programming (1999), 463-472
