
An Index Data Structure for String Search in External Memory  

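Na, Joong-Chae (School of Electrical Engineering and Computer Science, Seoul National University)
Park, Kun-Soo (School of Electrical Engineering and Computer Science, Seoul National University)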
Abstract
We propose a new external-memory index data structure, the Suffix B-tree. Like the String B-tree, the Suffix B-tree is a B-tree whose keys are strings. Whereas a node of the String B-tree is implemented with a Patricia trie, a node of the Suffix B-tree is implemented with an array, so the Suffix B-tree is simpler and easier to implement than the String B-tree. Nevertheless, the branching algorithm of the Suffix B-tree is as efficient as that of the String B-tree. Consequently, the Suffix B-tree requires the same worst-case number of disk accesses as the String B-tree to solve the string matching problem, a fundamental problem in the area of string algorithms.
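To illustrate what an array-based node could look like, here is a minimal sketch in Python. It is not the authors' method: the abstract does not describe the branching algorithm, so this sketch assumes a node is simply a sorted array of suffix start positions and branches by plain binary search with full string comparisons. The names `build_node`, `branch`, and `occurs` are hypothetical, and the sketch ignores disk layout and the efficiency guarantees claimed in the abstract.

```python
from typing import Iterable, List


def build_node(text: str, positions: Iterable[int]) -> List[int]:
    """Return suffix start positions sorted by the suffixes they denote.

    Assumption: a node is modelled as a plain sorted array of positions.
    """
    return sorted(positions, key=lambda p: text[p:])


def branch(text: str, node: List[int], pattern: str) -> int:
    """Return the index of the first suffix in the node that is >= pattern.

    Naive binary search: each step compares the pattern with a whole suffix.
    """
    lo, hi = 0, len(node)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[node[mid]:] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo


def occurs(text: str, node: List[int], pattern: str) -> bool:
    """Report whether the pattern is a prefix of some suffix in the node."""
    i = branch(text, node, pattern)
    return i < len(node) and text.startswith(pattern, node[i])


text = "banana"
node = build_node(text, range(len(text)))
print(node)                       # sorted suffix positions
print(occurs(text, node, "nan"))  # pattern present in the text
print(occurs(text, node, "xyz"))  # pattern absent from the text
```

In this naive form every comparison may read an entire suffix from the text, which is exactly the cost an external-memory branching algorithm is designed to avoid; the sketch only shows the array layout and the role of branching.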
Keywords
String algorithms; external memory algorithms; index data structures; pattern matching