[KSCI] Korea Science Citation Index Service

Partitioning and Merging an Index for Efficient XML Keyword Search

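Kim, Sung-Jin (School of Electrical Engineering and Computer Science, Seoul National University)
Lee, Hyung-Dong (School of Electrical Engineering and Computer Science, Seoul National University)
Kim, Hyoung-Joo (School of Electrical Engineering and Computer Science, Seoul National University)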

Publication Information

Journal of KIISE:Databases / v.33, no.7, 2006, pp. 754-765

Abstract

In XML keyword search, the unit of indexing is an XML element rather than a document, and a search result is defined as the set of smallest elements (i.e., least common ancestors) that contain all query keywords. Under a conventional index structure, every least common ancestor produced by combining elements that each contain a query keyword is considered for the search result. In this paper, to avoid unnecessary least-common-ancestor computations and reduce query processing time, we describe how to construct a partitioned index composed of several partitions and how to produce a search result by merging those partitions when necessary. When the search result is restricted to least common ancestors whose depths are greater than a given minimum depth, a search system using the proposed partitioned index can reduce query processing time by considering only combinations of elements that belong to the same partition. When the minimum depth is not given or is unknown, a search system can still obtain the search result from the partitioned index, in the same query processing time as with a non-partitioned index. Our experiments, conducted on XML documents provided by the DBLP site and INEX 2003, show that the partitioned index substantially reduces query processing time when the minimum depth is given.
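
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: elements are identified by Dewey IDs (tuples of child positions under a single root), depth is taken to be the length of the Dewey ID, a partition is the set of elements sharing the same prefix of length partition_depth, and the restricted search keeps least common ancestors at depth partition_depth or deeper. The function names (build_partitioned_index, search_partitioned, search_merged) are hypothetical.

```python
from collections import defaultdict
from itertools import product


def lca(a, b):
    """Least common ancestor of two Dewey IDs: their longest common prefix."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def candidate_lcas(posting_lists):
    """LCAs of every combination that takes one element per keyword."""
    found = set()
    for combo in product(*posting_lists):
        anc = combo[0]
        for element in combo[1:]:
            anc = lca(anc, element)
        if anc:  # an empty prefix would mean no common ancestor
            found.add(anc)
    return found


def smallest(lcas):
    """Drop every LCA that has another candidate LCA strictly below it."""
    return {a for a in lcas
            if not any(len(b) > len(a) and b[:len(a)] == a for b in lcas)}


def build_partitioned_index(postings, partition_depth):
    """Group each keyword's elements by their Dewey prefix of length partition_depth."""
    index = defaultdict(lambda: defaultdict(list))
    for keyword, elements in postings.items():
        for dewey in elements:
            index[dewey[:partition_depth]][keyword].append(dewey)
    return index


def search_partitioned(index, keywords, partition_depth):
    """Restricted search: combine only elements that share a partition."""
    found = set()
    for part in index.values():
        if all(k in part for k in keywords):
            found |= candidate_lcas([part[k] for k in keywords])
    return smallest({a for a in found if len(a) >= partition_depth})


def search_merged(index, keywords):
    """Unrestricted search: merge all partitions back into one posting list per keyword."""
    merged = defaultdict(list)
    for part in index.values():
        for keyword, elements in part.items():
            merged[keyword].extend(elements)
    if not all(k in merged for k in keywords):
        return set()
    return smallest(candidate_lcas([merged[k] for k in keywords]))


# Hypothetical postings: keyword -> Dewey IDs of elements containing it.
postings = {
    "xml": [(0, 0, 0, 1), (0, 1, 0)],
    "index": [(0, 0, 0, 2), (0, 1, 1)],
}
index = build_partitioned_index(postings, partition_depth=3)
restricted = search_partitioned(index, ["xml", "index"], partition_depth=3)
unrestricted = search_merged(index, ["xml", "index"])
print(restricted)
print(unrestricted)
```

In this toy data the restricted search examines a single combination, inside the partition (0, 0, 0), whereas the merged search examines all four combinations and additionally finds the shallower ancestor (0, 1), which lies above the partition depth.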

Keywords

XML (eXtensible Markup Language); XML Keyword Search; Partitioned Index

