http://dx.doi.org/10.3745/KIPSTD.2010.17D.2.103

The Path Inverted Index Technique for XML Document Retrieval  

Moon, Kyung-Won (NHN Corp., Search Development Center)
Hwang, Byung-Yeon (The Catholic University of Korea, School of Computer Science and Information Engineering)
Abstract
Many XML document management systems that build on the strengths of an RDBMS have recently been developed for storing, processing, and retrieving XML documents. However, partial pattern-matching queries, such as those using the LIKE operator, cannot take advantage of RDBMS indexes, and their inefficient comparison processing degrades retrieval performance. This paper proposes a hierarchical XML storage technique, which stores XML documents efficiently in an RDBMS, and a path inverted index technique. The path inverted index treats each element of an XML document as a keyword and organizes a posting file of path identifiers and sequences to reduce the retrieval time of path-based queries. In simulations, the proposed methods showed about 60% better search performance than the conventional RDBMS-based method.
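The idea of a posting file keyed by element name can be illustrated with a small in-memory sketch. It is not the paper's implementation: the abstract does not give the storage schema, so the posting layout `(path_id, sequence)`, the function names `build_path_index` and `find_paths`, and the use of Python dictionaries in place of RDBMS tables are all assumptions made for illustration. Here `sequence` is taken to mean the position of the element within its root-to-node label path.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(xml_text):
    """Build a toy path inverted index (assumed layout, not the paper's schema).

    Returns (paths, postings):
      paths    maps path_id -> tuple of element names from the root
      postings maps element name -> set of (path_id, sequence)
    """
    root = ET.fromstring(xml_text)
    path_ids = {}                  # label path (tuple of tags) -> path_id
    postings = defaultdict(set)    # element name (keyword) -> postings

    def visit(node, prefix):
        path = prefix + (node.tag,)
        if path not in path_ids:
            pid = len(path_ids) + 1
            path_ids[path] = pid
            # One posting per element on the path: (path_id, position in path).
            for seq, tag in enumerate(path, start=1):
                postings[tag].add((pid, seq))
        for child in node:
            visit(child, path)

    visit(root, ())
    paths = {pid: path for path, pid in path_ids.items()}
    return paths, postings


def find_paths(postings, steps):
    """Return path IDs whose label path contains `steps` as consecutive elements.

    Uses exact lookups on (path_id, sequence) instead of string pattern matching.
    """
    candidates = set(postings.get(steps[0], ()))
    for offset, tag in enumerate(steps[1:], start=1):
        entries = postings.get(tag, set())
        candidates = {(pid, seq) for pid, seq in candidates
                      if (pid, seq + offset) in entries}
    return sorted({pid for pid, _ in candidates})


# Hypothetical sample document, used only to exercise the sketch.
doc = ("<lib><book><title/><author><name/></author></book>"
       "<paper><author><name/></author></paper></lib>")
paths, postings = build_path_index(doc)
print(find_paths(postings, ["author", "name"]))  # [5, 8]
print(paths[5])  # ('lib', 'book', 'author', 'name')
```

A query such as `author/name` is answered by exact lookups on element names and sequence numbers rather than by a LIKE-style scan over stored path strings, which is the kind of comparison the abstract identifies as the bottleneck.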
Keywords
XML; RDBMS; Document Retrieval; Inverted Index Method; Path Query