
A Ranking Technique of XML Documents using Path Similarity for Expanded Query Processing  

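Kim, Hyun-Joo (Samsung Electronics, Telecommunication Network Business)
Park, So-Mi (Department of Computer Engineering, Sogang University)
Park, Seog (Department of Computer Engineering, Sogang University)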
Abstract
XML is widely used for storing and processing data. Because XML documents have an explicit structure, users can retrieve information from them with XPath queries. An XPath query is answered only when the terms and structure of the query match those of the document. However, many documents today are written with different terminology and structures, so users often do not know exactly how the target data is expressed. In practice, a document may well contain the information the user is looking for, or something similar to it. User queries should therefore still be processed when their terms or structure differ slightly from those of the document. To this end, we propose an XML document ranking method based on path similarity. The method measures the semantic similarity between a user query and a document in three steps, using position, node, and relaxation factors.
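The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It first shows a strict XPath lookup missing a document that uses a different term, then ranks documents with a toy path similarity. The synonym table (a stand-in for a lexical resource such as WordNet), the 0.8 synonym weight, the way the three factors are computed, and the names `node_score`, `path_similarity`, `all_paths`, and `rank` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical synonym table; a real system might consult WordNet instead.
SYNONYMS = {"author": {"writer"}}

DOCS = {
    "exact": "<library><book><author>Kim</author></book></library>",
    "synonym": "<library><book><writer>Park</writer></book></library>",
    "nested": "<library><shelf><book><writer>Lee</writer></book></shelf></library>",
    "unrelated": "<catalog><item><price>10</price></item></catalog>",
}
QUERY = ("library", "book", "author")


def node_score(q, d):
    """Label similarity: 1.0 if identical, 0.8 if listed synonyms (assumed weight), else 0.0."""
    if q == d:
        return 1.0
    if d in SYNONYMS.get(q, set()) or q in SYNONYMS.get(d, set()):
        return 0.8
    return 0.0


def path_similarity(query, path):
    """Toy score in [0, 1]: product of assumed node, position, and relaxation factors."""
    cursor, node_sum, shift_sum, matched = 0, 0.0, 0, 0
    for i, q in enumerate(query):
        # Greedy in-order alignment: take the first compatible label at or after the cursor.
        for j in range(cursor, len(path)):
            s = node_score(q, path[j])
            if s > 0:
                node_sum += s
                shift_sum += abs(i - j)
                matched += 1
                cursor = j + 1
                break
    if matched == 0:
        return 0.0
    node_factor = node_sum / len(query)                    # how well the labels agree
    position_factor = 1.0 / (1.0 + shift_sum / matched)    # how far matched nodes moved
    relaxation_factor = matched / len(path)                # penalty for extra nodes in the path
    return node_factor * position_factor * relaxation_factor


def all_paths(elem, prefix=()):
    """Yield the tag path from the root to every element in the tree."""
    here = prefix + (elem.tag,)
    yield here
    for child in elem:
        yield from all_paths(child, here)


def rank(query, docs):
    """Score each document by its best-matching path and sort in descending order."""
    scored = []
    for name, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        best = max(path_similarity(query, p) for p in all_paths(root))
        scored.append((name, best))
    return sorted(scored, key=lambda t: t[1], reverse=True)


# Strict matching: the path ./book/author finds nothing when the document says <writer>.
strict_hits = {name: len(ET.fromstring(x).findall("./book/author")) for name, x in DOCS.items()}
ranking = rank(QUERY, DOCS)

if __name__ == "__main__":
    print(strict_hits)
    for name, score in ranking:
        print(f"{name}: {score:.3f}")
```

In this sketch the strict lookup matches only the document with identical tags, while the similarity ranking still surfaces the documents that use a synonym or an extra nesting level, with lower scores.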
Keywords
Information Retrieval System; XML Document Searching; Ranking System; Web Database;