• Title/Summary/Keyword: XML Index


A Path Partitioning Technique for Indexing XML Data (XML 데이타 색인을 위한 경로 분할 기법)

  • 김종익;김형주
    • Journal of KIISE:Databases / v.31 no.3 / pp.320-330 / 2004
  • Query languages for XML use paths in a data graph to represent queries; paths are the basic building block of an XML query. Users can write more expressive queries by using patterns (e.g., regular expressions) for paths. Because the data is semi-structured, a data graph contains many identical paths. Current research on indexing XML exploits these identical paths, but the resulting index can grow larger than the source data graph and cannot guarantee an efficient access path. In this paper, we propose a technique that partitions all the paths in a data graph. We develop an index graph that efficiently finds the appropriate partitions for a path query. The size of our index graph can be adjusted regardless of the source data, so we can significantly reduce the cost of index graph traversals. In the performance study, we show that our index is much faster than other graph-based indexes.

The Path Inverted Index Technique for XML Document Retrieval (XML 문서 검색을 위한 경로 역 색인 기법)

  • Moon, Kyung-Won;Hwang, Byung-Yeon
    • The KIPS Transactions:PartD / v.17D no.2 / pp.103-110 / 2010
  • Recently, many XML document management systems that exploit the advantages of an RDBMS have been developed for the storage, processing, and retrieval of XML documents. However, partial pattern-matching queries such as LIKE operations cannot take advantage of the RDBMS index, and their inefficient comparison processing degrades retrieval performance. This paper proposes a hierarchical XML storage technique, which stores XML documents in an RDBMS efficiently, and a path inverted index technique. The index treats each element of an XML document as a keyword and organizes a posting file with path identifiers and sequences to reduce the retrieval time of path-based queries. In simulations, our methods showed about 60% better search performance than the conventional RDBMS-based method.
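
The posting-file idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `build_path_inverted_index` and `lookup`, the use of root-to-element label paths as path identifiers, and the use of document order as the "sequence" are all assumptions made for illustration.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def build_path_inverted_index(xml_text):
    """Return (path_ids, postings) for one XML document.

    path_ids maps a label path such as "/lib/book/title" to an integer id.
    postings maps an element name (the keyword) to a list of
    (path_id, sequence) pairs; sequence is the document-order position.
    Both structures are hypothetical stand-ins for the paper's posting file.
    """
    root = ET.fromstring(xml_text)
    path_ids = {}
    postings = defaultdict(list)
    sequence = 0

    def visit(node, parent_path):
        nonlocal sequence
        path = parent_path + "/" + node.tag
        # Assign the next free integer the first time a path is seen.
        path_id = path_ids.setdefault(path, len(path_ids))
        postings[node.tag].append((path_id, sequence))
        sequence += 1
        for child in node:
            visit(child, path)

    visit(root, "")
    return path_ids, dict(postings)


def lookup(path_ids, postings, path):
    """Answer an exact path query by comparing integer path ids
    rather than by string pattern matching."""
    path_id = path_ids.get(path)
    if path_id is None:
        return []
    keyword = path.rsplit("/", 1)[-1]
    return [seq for pid, seq in postings.get(keyword, []) if pid == path_id]
```

In this sketch a path query reads only the posting list of the final element name and filters it by path identifier, which is the kind of work a LIKE scan over stored path strings would otherwise do.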

Design and Implementation of an XML Document Management System Based on $O_2$ ($O_2$기반의 XML 문서관리 시스템 설계 및 구현)

  • 유재수
    • The Journal of Information Technology and Database / v.7 no.1 / pp.27-39 / 2000
  • In this paper, we design and implement an XML management system based on an OODBMS that supports structured information retrieval of XML documents. We also propose an object-oriented model for storing and fetching XML documents, managing image data, and supporting versioning in the XML document management system (XMS). The XMS consists of a repository manager that maintains the interfaces for external application programs, an XML instance storage manager that stores XML documents in the database, an XML instance manager that fetches XML documents stored in the database, an XML index manager that creates indexes for the structure information and the contents of documents, and a query processor that processes various queries.


An Efficient BitmapInvert Index based on Relative Position Coordinate for Retrieval of XML documents (효율적인 XML검색을 위한 상대 위치 좌표 기반의 BitmapInvert Index 기법)

  • Kim, Tack-Gon;Kim, Woo-Saeng
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.1 s.307 / pp.35-44 / 2006
  • Many index techniques for storing and querying XML documents have been studied, and many of them use coordinate-based methods. However, update operations and the query processing needed to express structural relations among elements, attributes, and text impose a heavy burden. In this paper, we propose an efficient BitmapInvert index technique based on the Relative Position Coordinate (RPC). RPC performs well even under frequent update operations because it represents the relationship between a node, its parent node, and its left and right sibling nodes. The BitmapInvert index supports tort query with bitwise operations and, by using the PostUpdate algorithm, does not cause serious performance degradation on update operations. Overall, performance improves because the number of node traversals is reduced.

PIX: Partitioned Index for Keyword Search over XML Documents (PIX: XML문서 검색을 위한 색인 분할 기법)

  • Lee Hongrae;Lee Hyungdong;Yoo Sangwon;Kim Hyoung-Joo
    • Journal of KIISE:Databases / v.31 no.6 / pp.710-720 / 2004
  • Because XML documents carry much richer information than plain text, they allow very elaborate, fine-grained search that was difficult in the past. However, the cost of fine-grained, element-level search is very high, so processing overhead has become a new challenge. We propose an inverted index structure called PIX, which reduces the number of elements processed by partitioning elements according to their match potential. We choose a base level and partition elements according to whether they can have a common ancestor higher than that level. We also propose a partition merging technique that yields the same results as the unpartitioned case. Our experimental results show that the index partitioning strategy can reduce processing time considerably.

Partitioning and Merging an Index for Efficient XML Keyword Search (효율적 XML키워드 검색을 인덱스 분할 및 합병)

  • Kim, Sung-Jin;Lee, Hyung-Dong;Kim, Hyoung-Joo
    • Journal of KIISE:Databases / v.33 no.7 / pp.754-765 / 2006
  • In XML keyword search, a search result is defined as the set of the smallest elements (i.e., least common ancestors) containing all query keywords, and the granularity of indexing is an XML element rather than a document. Under the conventional index structure, all least common ancestors produced by combining elements that each contain a query keyword are considered as a search result. In this paper, to avoid unnecessary operations for producing least common ancestors and to reduce query processing time, we describe a way to construct a partitioned index composed of several partitions and to produce a search result by merging those partitions when necessary. When a search result is restricted to least common ancestors whose depths are higher than a given minimum depth, search systems using the proposed partitioned index can reduce query processing time by considering only combinations of elements that belong to the same partition. Even when the minimum depth is not given or is unknown, search systems can obtain a search result with the partitioned index in the same query processing time as with a non-partitioned index. Our experiments were conducted on XML documents provided by the DBLP site and INEX2003, and the partitioned index substantially reduced query processing time when the minimum depth was given.
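
The following Python sketch illustrates why a minimum depth lets a search combine only elements in the same partition. It rests on several assumptions that are not from the paper: elements are identified by Dewey labels (tuples such as `(1, 2, 1)`), depth is the label length, "higher than a minimum depth" is read as "at least `min_depth`", queries have two keywords, and every pairwise least common ancestor is reported rather than only the smallest ones. The function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def common_prefix(a, b):
    """Dewey label of the least common ancestor of two elements."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def partition(postings, depth):
    """Group each keyword's elements by their ancestor at the given depth.

    Elements shallower than `depth` are dropped: they cannot produce a
    least common ancestor of depth >= `depth`.
    """
    parts = defaultdict(lambda: defaultdict(list))
    for keyword, labels in postings.items():
        for label in labels:
            if len(label) >= depth:
                parts[label[:depth]][keyword].append(label)
    return parts


def search_partitioned(postings, kw1, kw2, min_depth):
    """Combine elements only within the same partition."""
    results = set()
    for group in partition(postings, min_depth).values():
        for a, b in product(group.get(kw1, []), group.get(kw2, [])):
            results.add(common_prefix(a, b))
    return results


def search_unpartitioned(postings, kw1, kw2, min_depth):
    """Baseline: combine every pair, then filter by depth."""
    results = set()
    for a, b in product(postings.get(kw1, []), postings.get(kw2, [])):
        lca = common_prefix(a, b)
        if len(lca) >= min_depth:
            results.add(lca)
    return results
```

Two elements whose labels differ within the first `min_depth` components can only meet at a shallower ancestor, so skipping cross-partition pairs loses no qualifying result. The sketch builds partitions at query time for brevity; a real index would build them once in advance.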

Directory Index : Effective Index Structure for Query Processing of XML Data stored in RDBMS (디렉토리 인덱스 : 관계형 데이타베이스 시스템에서 XML 데이타의 효과적인 질의 처리를 위한 인덱스 구조)

  • 백성호;이석호
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.22-24 / 2002
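  • As XML has been adopted as the standard for data exchange on the Web, storing and processing XML data with relational databases has been widely studied. In this study, we propose the directory index, an index structure for effective query processing of XML data stored in a relational database. When processing regular path expressions, the directory index uses bitmaps to greatly reduce join operations, which shortens processing time, and it also copes effectively with index updates.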


A Type Hierarchy Index for XML Databases with XML Schema (XML Schema에 의한 XML 데이타베이스의 타입 상속 색인구조)

  • Lim Yun-Ju;Lee Jong-Hak
    • Proceedings of the Korea Information Processing Society Conference / 2004.11a / pp.85-88 / 2004
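  • Along with the growth of the Web, XML databases have recently contributed greatly to wide-ranging resource sharing on the Internet, and for such sharing, XML Schema, which has a type inheritance structure, is used as the structural definition of XML databases. Research on efficient indexing techniques for XML databases that conform to XML Schema is therefore needed. In this paper, we propose 2D-THI, which uses an existing multidimensional index structure together with information on user query patterns analyzed in advance to build an optimal two-dimensional type index structure that minimizes the average number of index pages accessed by the given queries. To evaluate the performance of the proposed 2D-THI, we apply the CH-index and the CG-tree, which are widely used as index structures for class inheritance in object-oriented databases, to XML databases and compare them with 2D-THI through a cost model. The results show that the proposed 2D-THI can build an optimal index structure for a variety of query patterns.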


Hippocratic XML Databases: A Model and Access Control Mechanism (히포크라테스 XML 데이터베이스: 모델 및 액세스 통제 방법)

  • Lee Jae-Gil;Han Wook-Shin;Whang Kyu-Young
    • Journal of KIISE:Databases / v.31 no.6 / pp.684-698 / 2004
  • The Hippocratic database model recently proposed by Agrawal et al. incorporates privacy protection capabilities into relational databases. Since the Hippocratic database is based on the relational database, it needs extensions to be adapted for XML databases. In this paper, we propose the Hippocratic XML database model, an extension of the Hippocratic database model for XML databases, and present an efficient access control mechanism under this model. In contrast to relational data, XML data have tree-like hierarchies. Thus, in order to manage these hierarchies, we extend and formally define concepts presented in the Hippocratic database model such as privacy preferences, privacy policies, privacy authorizations, and usage purposes of data records. Next, we present a new mechanism, which we call the authorization index, that is used in the access control mechanism. This authorization index, which is implemented using a multi-dimensional index, allows us to efficiently search authorizations implied by the authorization granted on the nearest ancestor using the nearest neighbor search technique. Using synthetic and real data, we have performed extensive experiments comparing query processing time with those of existing access control mechanisms. The results show that the proposed access control mechanism improves the wall clock time by up to 13.6 times over the top-down access control strategy and by up to 20.3 times over the bottom-up access control strategy. The major contributions of our paper are 1) extending the Hippocratic database model into the Hippocratic XML database model and 2) proposing an efficient access control mechanism that uses the authorization index and nearest neighbor search technique under this model.

Multi-Path Index Scheme for the Efficient Retrieval of XML Data (XML 데이타의 효과적인 검색을 위한 다중 경로 인덱스)

  • Song, Ha-Joo;Kim, Hyoung-Joo
    • Journal of KIISE:Computing Practices and Letters / v.7 no.1 / pp.12-23 / 2001
  • Extended path expressions denote multiple paths concisely by using the '$\ast$' character. They are convenient for expressing OQL queries that retrieve XML data stored in OODBs. In this paper, we propose a multi-path index scheme as a new index scheme for efficiently processing queries with extended path expressions. The proposed scheme allocates a unique path identifier to every possible single path in an extended path expression and provides both single-path and multiple-path indexing by composing the index key with the path identifier, while using only one index structure. It performs better than single-path index schemes and is practical, since it can be implemented with little modification to the leaf records of a B+-tree index.
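
As a rough illustration of composing an index key with a path identifier, the Python sketch below keeps sorted `(key, path_id, object_id)` entries in a list as a stand-in for B+-tree leaf records. The class name `MultiPathIndex`, its methods, and the entry layout are assumptions for illustration and are not taken from the paper.

```python
import bisect


class MultiPathIndex:
    """One sorted structure that serves single-path and multi-path lookups."""

    def __init__(self):
        # Entries are (key, path_id, object_id), kept in sorted order.
        self._entries = []

    def insert(self, key, path_id, object_id):
        bisect.insort(self._entries, (key, path_id, object_id))

    def search_single(self, key, path_id):
        """Objects with this key reached through one specific path."""
        start = bisect.bisect_left(self._entries, (key, path_id))
        found = []
        for k, p, oid in self._entries[start:]:
            if (k, p) != (key, path_id):
                break
            found.append(oid)
        return found

    def search_multi(self, key):
        """Objects with this key reached through any path of the
        extended path expression."""
        start = bisect.bisect_left(self._entries, (key,))
        found = []
        for k, _, oid in self._entries[start:]:
            if k != key:
                break
            found.append(oid)
        return found
```

Because the path identifier follows the key in the sort order, all entries for one key are contiguous (multi-path lookup), and within them the entries for one path are contiguous as well (single-path lookup).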
