• Title/Summary/Keyword: XML Index


A Multi-level Inverted Index Technique for Structural Document Search (구조화 문서 검색을 위한 다단계 역색인 기법)

  • Kim, Jong-Ik
    • The KIPS Transactions:PartB / v.15B no.4 / pp.355-364 / 2008
  • In general, an inverted index can be used to retrieve element lists from structured documents: it returns the list of elements that share a given tag name. With this approach, however, the cost of query processing grows linearly with the length of the path query, because every structural relationship (parent-child and ancestor-descendant) must be resolved by a structural join operation. In this paper, we propose an inverted index technique and a novel structural join technique for accelerating XML path query evaluation. Our inverted index can retrieve element lists for path segments in a parent-child relationship. Our structural join technique handles lists of element pairs, whereas existing techniques handle lists of elements. We show through experiments that the two proposed techniques, used together, accelerate the evaluation of XML path queries.
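The baseline described in this abstract (a tag-name inverted index plus one structural join per path step) can be sketched as follows. This is a minimal illustration, not the paper's proposed technique: the (start, end, level) region labels, the naive nested-loop join, and the names `build_index`, `structural_join`, and `evaluate_path` are all assumptions made for the example.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def build_index(xml_text):
    """Map each tag name to a sorted list of (start, end, level) labels."""
    index = defaultdict(list)
    counter = 0

    def visit(node, level):
        nonlocal counter
        start = counter
        counter += 1
        for child in node:
            visit(child, level + 1)
        end = counter
        counter += 1
        index[node.tag].append((start, end, level))

    visit(ET.fromstring(xml_text), 0)
    for postings in index.values():
        postings.sort()
    return index


def structural_join(ancestors, descendants, parent_child):
    """Keep the descendants that lie inside at least one ancestor region."""
    result = []
    for d in descendants:
        for a in ancestors:
            inside = a[0] < d[0] and d[1] < a[1]
            # A parent-child match additionally requires adjacent levels.
            if inside and (not parent_child or d[2] == a[2] + 1):
                result.append(d)
                break
    return result


def evaluate_path(index, steps):
    """steps is a list of (axis, tag) with axis '/' or '//'.
    The axis of the first step is ignored: it only selects by tag."""
    current = index.get(steps[0][1], [])
    joins = 0
    for axis, tag in steps[1:]:
        current = structural_join(current, index.get(tag, []), axis == "/")
        joins += 1
    return current, joins


doc = "<a><b><c/></b><c/><d><b><c/></b></d></a>"
index = build_index(doc)
matches, joins = evaluate_path(index, [("/", "a"), ("/", "b"), ("/", "c")])
everywhere, _ = evaluate_path(index, [("/", "a"), ("//", "c")])
```

A path with n steps needs n - 1 joins here, which is the linear cost the abstract refers to. An index that returns results for a whole parent-child segment at once would reduce that number.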

XML View Indexing (XML 뷰 인덱싱)

  • 김영성;강현철
    • Journal of KIISE:Databases / v.30 no.3 / pp.252-272 / 2003
  • The view mechanism provides users with appropriate portions of a database through data filtering and integration. In the Web era, where information proliferates, the view concept is also useful for XML, a future standard for data exchange on the Web. This paper proposes a method of implementing XML views called XML view indexing, whereby an XML view xv is represented as an XML view index (XVI), a structure containing the identifiers of xv's underlying XML elements as well as information on xv. Since the XVI for xv stores only the identifiers of the XML elements and not the elements themselves, the XVI must be materialized against xv's underlying XML documents when a user requests xv. An efficient algorithm is also required to incrementally maintain the consistency of the XVI when xv's underlying XML documents are updated. This paper proposes and implements data structures and algorithms for XML view indexing. Performance experiments show that XML view indexing outperforms view recomputation for repeated accesses to the view and requires up to about 30 times less storage space than XML view materialization, although the latter takes less time for repeated accesses because it needs no materialization step.
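The idea of storing only element identifiers and materializing on request can be sketched roughly as below. The class `XMLViewIndex`, the dictionary-based element store, and the predicate-style view definition are hypothetical simplifications for illustration. The paper's actual data structures and maintenance algorithm are not given in the abstract.

```python
class XMLViewIndex:
    """Keeps only the ids of the elements in a view, not the elements."""

    def __init__(self, predicate):
        self.predicate = predicate  # view definition: element -> bool
        self.ids = set()

    def build(self, store):
        """Full computation: scan every element once."""
        self.ids = {eid for eid, elem in store.items() if self.predicate(elem)}

    def on_update(self, eid, elem):
        """Incremental maintenance: re-check only the changed element.
        Pass elem=None to signal a deletion."""
        if elem is not None and self.predicate(elem):
            self.ids.add(eid)
        else:
            self.ids.discard(eid)

    def materialize(self, store):
        """Fetch the actual elements for the stored ids."""
        return [store[eid] for eid in sorted(self.ids)]


# Hypothetical element store: id -> element represented as a dict.
store = {
    1: {"tag": "book", "price": 12},
    2: {"tag": "book", "price": 45},
    3: {"tag": "cd", "price": 9},
}
cheap_books = XMLViewIndex(lambda e: e["tag"] == "book" and e["price"] < 20)
cheap_books.build(store)

# An update to element 2 brings it into the view without a full rescan.
store[2] = {"tag": "book", "price": 15}
cheap_books.on_update(2, store[2])
view = cheap_books.materialize(store)
```

The trade-off matches the one in the abstract: the index holds a few integers instead of copies of the elements, at the price of a lookup pass each time the view is read.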

Update conscious and depth insensitive inverted indexes for XML full-text queries (XML 문서의 변경을 고려한 XML 전문 검색 역인덱스)

  • Kwon, Guk-Bong;Hong, Dong-Kweon;Kim, Kweon-Yang
    • Proceedings of the Korea Information Processing Society Conference / 2004.05a / pp.81-84 / 2004
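  • Unlike relational tables, XML documents have very complex and irregular structures, so full-text search, which makes the most of partial information, plays a more important role than ordinary structural search. Because XML documents are hierarchical, full-text search operations that take the hierarchy into account can narrow the search space and thereby greatly improve both the accuracy and the efficiency of search. The most common way to support full-text search operations effectively is to use an inverted index. Existing methods for representing and storing the structural information of XML documents for full-text search have considered only static documents, whose content does not change. These methods are inefficient when documents change dynamically, because much of the stored structural information must be rebuilt. To support dynamic changes to XML documents together with complex XML full-text search, this paper proposes an efficient technique for building an inverted index that uses path strings, and shows that the proposed method handles both searches over complex documents and dynamic document changes efficiently.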


k-Bitmap Clustering Method for XML Data based on Relational DBMS (관계형 DBMS 기반의 XML 데이터를 위한 k-비트맵 클러스터링 기법)

  • Lee, Bum-Suk;Hwang, Byung-Yeon
    • The KIPS Transactions:PartD / v.16D no.6 / pp.845-850 / 2009
  • The use of XML data has increased with the growth of the Web 2.0 environment. The advantages of XML are recognized through its use as the base technology of RSS and Atom for transferring information from blogs and news feeds. Bitmap clustering is a method that keeps an index in main memory on top of a relational DBMS, and it performed better than other XML indexing methods in evaluation. The existing method, however, generates too many clusters, which degrades search quality. This paper proposes a k-Bitmap clustering method that generates a user-defined number k of clusters to solve this problem. The proposed method also keeps an additional inverted index for searching terms that are excluded from the representative bits of the k-Bitmap. Our evaluation shows that users can control the number of clusters. Our method also achieves high recall in single-term search and, by keeping two indices, guarantees that the search result includes all documents related to the query.

Storage and Retrieval of XML Documents Without Redundant Path Information (경로정보의 중복을 제거한 XML 문서의 저장 및 질의처리 기법)

  • Lee Hiye-Ja;Jeong Byeong-Soo;Kim Dae-Ho;Lee Young-Koo
    • The KIPS Transactions:PartD / v.12D no.5 s.101 / pp.663-672 / 2005
  • This paper proposes an approach that removes the redundancy of path information and uses an inverted index as an efficient way to store a large volume of XML documents and retrieve the desired information from them. An XML document is decomposed into nodes based on its tree structure and stored in relational tables according to node type, together with the path from the root to each node. Existing methods that use path information store data for all element paths, which causes retrieval performance to degrade as the data volume grows. Our approach stores data only for leaf element paths and excludes internal element paths. Because the inverted index is built from leaf element paths only, the number of posting lists per keyword is smaller than in the existing methods. For the storage and retrieval of XML data, our approach requires neither the XML schema information of the documents nor any extension of the relational database. We demonstrate that our approach performs better than the existing approaches within the scope of our experiments.
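A rough sketch of the leaf-path idea, under stated assumptions: in-memory lists stand in for the relational tables, only element text is indexed (attributes and mixed content are ignored), and the helper names `decompose`, `build_keyword_index`, and `search` are invented for this example rather than taken from the paper.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def decompose(xml_text):
    """Return (path_ids, leaves). Only leaf element paths receive an id;
    each leaf row is (path_id, text). Internal paths are not stored."""
    path_ids = {}
    leaves = []

    def visit(node, prefix):
        path = prefix + "/" + node.tag
        children = list(node)
        if children:
            for child in children:
                visit(child, path)
        else:
            pid = path_ids.setdefault(path, len(path_ids))
            leaves.append((pid, (node.text or "").strip()))

    visit(ET.fromstring(xml_text), "")
    return path_ids, leaves


def build_keyword_index(leaves):
    """Inverted index: keyword -> set of leaf row numbers."""
    index = defaultdict(set)
    for row, (_pid, text) in enumerate(leaves):
        for word in text.lower().split():
            index[word].add(row)
    return index


def search(path_ids, leaves, index, path, keyword):
    """Return the text of leaves on `path` that contain `keyword`."""
    pid = path_ids.get(path)
    rows = sorted(index.get(keyword.lower(), ()))
    return [leaves[r][1] for r in rows if leaves[r][0] == pid]


doc = (
    "<lib>"
    "<book><title>XML Index</title><author>Kim</author></book>"
    "<book><title>Graph Search</title><author>Lee</author></book>"
    "</lib>"
)
path_ids, leaves = decompose(doc)
kw_index = build_keyword_index(leaves)
hits = search(path_ids, leaves, kw_index, "/lib/book/title", "index")
```

Here the paths `/lib` and `/lib/book` never get a row of their own, so only two distinct paths are stored for the four leaf values.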

A XML Instance Repository Model based on the Edge-Labeled Graph (Edge-Labeled 그래프 기반의 XML 인스턴스 저장 모델)

  • Kim Jeong-Hee;Kwak Ho-Young
    • Journal of Internet Computing and Services / v.4 no.6 / pp.33-42 / 2003
  • An XML instance repository model based on the Edge-Labeled Graph is suggested for storing XML instances in relational databases. The repository model represents an XML instance as a data graph based on the Edge-Labeled Graph, extracts the values defined by the Data Path, Element, Attribute, and Table Index tables that make up the database schema, and stores these values using the Mapper module. To support queries, the repository model offers a module that translates XQL, a query language that follows XPath, into SQL, and it has a DBtoXML generator module that restores the stored XML instance. As a result, the storage relationship between XML instances and the proposed repository model can be represented in terms of graph-based paths, and the model shows that specific element and attribute information can be searched easily.
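For readers unfamiliar with edge-based storage, the general pattern of shredding an XML tree into a relational table and answering a path query with SQL self-joins can be sketched with SQLite. This is a simplified illustration, not the paper's model: it uses a single hypothetical `edge` table instead of the Data Path, Element, Attribute, and Table Index tables, and `path_to_sql` handles only plain child-axis paths rather than XQL.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_edges(conn, xml_text):
    """Store one row per labeled edge: parent id, label, child id, value."""
    conn.execute(
        "CREATE TABLE edge (source INTEGER, label TEXT, target INTEGER, value TEXT)"
    )
    next_id = 0

    def visit(node, parent_id):
        nonlocal next_id
        node_id = next_id
        next_id += 1
        text = (node.text or "").strip()
        conn.execute(
            "INSERT INTO edge VALUES (?, ?, ?, ?)",
            (parent_id, node.tag, node_id, text),
        )
        # Attributes become edges whose label starts with '@'.
        for name, val in node.attrib.items():
            conn.execute(
                "INSERT INTO edge VALUES (?, ?, ?, ?)",
                (node_id, "@" + name, next_id, val),
            )
            next_id += 1
        for child in node:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)


def path_to_sql(path):
    """Translate a child-axis path such as /a/b/c into a chain of self-joins."""
    labels = [p for p in path.split("/") if p]
    last = len(labels) - 1
    tables = ", ".join(f"edge e{i}" for i in range(len(labels)))
    conds = ["e0.source IS NULL"]  # the first step must be the root edge
    for i in range(len(labels)):
        conds.append(f"e{i}.label = ?")
        if i > 0:
            conds.append(f"e{i}.source = e{i - 1}.target")
    sql = (
        f"SELECT e{last}.value FROM {tables} WHERE "
        + " AND ".join(conds)
        + f" ORDER BY e{last}.target"
    )
    return sql, labels


conn = sqlite3.connect(":memory:")
load_edges(
    conn,
    '<lib><book id="b1"><title>XML</title></book>'
    '<book id="b2"><title>SQL</title></book></lib>',
)
sql, params = path_to_sql("/lib/book/title")
titles = [row[0] for row in conn.execute(sql, params)]
sql_ids, params_ids = path_to_sql("/lib/book/@id")
book_ids = [row[0] for row in conn.execute(sql_ids, params_ids)]
```

Each path step adds one self-join on the edge table, which is why storage models of this kind often keep a separate path table to shortcut long paths.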


XML Repository Model based on the Edge-Labeled Graph (Edge-Labeled Graph를 적용한 XML 저장 모델)

  • 김정희;곽호영
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.5 / pp.993-1001 / 2003
  • An RDB storage model based on the Edge-Labeled Graph is suggested for storing XML instances in relational databases (RDB). The XML instance being stored is represented as a Data Graph based on the Edge-Labeled Graph, and the values for the Data Path, Element, Attribute, and Table Index tables are extracted. The database schema is then defined, and the extracted values are stored using the Mapper. To support queries, the repository model offers a translator that converts XQL, which is used as a query language following XPath, into SQL. In addition, it provides a DBtoXML generator that restores the stored XML instance. As a result, the storage relationship between the XML instance and the proposed model structure can be expressed in terms of graph-based paths, and the model shows that arbitrary element and attribute information can be searched easily.

Techniques of XML Fragment Stream Organization for Efficient XML Query Processing in Mobile Clients (이동 클라이언트에서 효율적인 XML 질의 처리를 위한 XML 조각 스트림 구성 기법)

  • Ryu, Jeong-Hoon;Kang, Hyun-Chul
    • The Journal of Society for e-Business Studies / v.14 no.4 / pp.75-94 / 2009
  • Since XML emerged as a standard for data exchange on the web, it has been established as a core component of e-Commerce, and efficient query processing over XML data in ubiquitous computing environments has also been receiving much attention. Recently, techniques were proposed whereby an XML document is split into XML fragments that are streamed, and mobile clients receive the stream while processing queries over it. In processing queries over an XML fragment stream, the average access time depends significantly on the order of fragments in the stream. Query performance therefore requires an efficient organization of the XML fragment stream, as well as indexing that makes query processing energy-efficient by reducing tuning time. In this paper, we propose a technique of XML fragment stream organization based on query frequencies, fragment sizes, and fragment access frequencies, together with an active XML-based indexing scheme. Through implementation and performance experiments, our techniques were shown to be efficient compared with conventional XML fragment stream organizations.
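To illustrate why fragment order affects average access time, the sketch below uses a deliberately simplified model: a single pass over the stream, independent fragments, and access time measured as the time until the requested fragment has fully arrived. Under that model, sorting by access frequency divided by size minimizes the frequency-weighted average. The functions `average_access_time` and `organize` and the sample numbers are assumptions for the example, not the organization technique proposed in the paper.

```python
def average_access_time(order, size, freq):
    """Frequency-weighted mean time until a requested fragment has arrived."""
    total_freq = sum(freq[f] for f in order)
    elapsed = 0
    weighted = 0.0
    for f in order:
        elapsed += size[f]  # fragment f finishes arriving at this time
        weighted += freq[f] * elapsed
    return weighted / total_freq


def organize(size, freq):
    """Order fragments by access frequency per unit of size, highest first."""
    return sorted(size, key=lambda f: freq[f] / size[f], reverse=True)


# Hypothetical fragments: size in transmission units, freq in accesses.
size = {"A": 4, "B": 1, "C": 2}
freq = {"A": 2, "B": 5, "C": 3}

document_order = ["A", "B", "C"]
tuned_order = organize(size, freq)
before = average_access_time(document_order, size, freq)
after = average_access_time(tuned_order, size, freq)
```

Small, frequently accessed fragments move to the front, so most requests are satisfied early. A real fragment stream also has dependencies between fragments and repeated broadcast cycles, which this sketch leaves out.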


An XML-based Metadata Engine Design for Effective Retrieval in Video Recording System (동영상 저장 시스템에서 효율적인 검색을 위한 XML 메타데이터 엔진 설계)

  • Shin Eun Young;PARK Sung Han
    • Journal of Broadcast Engineering / v.10 no.2 / pp.202-209 / 2005
  • In this paper, we propose a design for the metadata engine of a video recording system that minimizes retrieval time. For this purpose, the proposed metadata engine stores the XML metadata as separate fragments and constructs a hierarchical indexing scheme based on the contextual and structural properties of the metadata. The hierarchical indexing scheme consists of a node index for basic searching and a group index for advanced searching. In this way, our approach can minimize the number of indexes and thus the retrieval time. Our simulation results show that the response time of the proposed system is shorter than that of previous work.

RDB Storage Model of XML Instance based on the Edge-Labeled Graph (Edge-Labeled Graph에 기반 한 XML 인스턴스의 RDB 저장 모델)

  • 김정희;김정필;곽호영
    • Proceedings of the Korean Information Science Society Conference / 2003.04a / pp.545-547 / 2003
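  • In this paper, we propose and implement a model for storing XML instances in a relational database (RDB) based on the Edge-Labeled Graph. The XML instances to be stored are represented as a Data Graph based on the Edge-Labeled Graph. Using this graph, the values defined in the Data Path, Element, Attribute, and Table Index tables are extracted, after which the Mapper is used to define the database schema and store the extracted values. To support queries, the RDB storage model provides a translator that converts XQL, which is used as a query language following XPath, into SQL, and it also includes a DBtoXML processor that restores the stored XML instances. The implementation shows that the storage relationship between XML instances and the RDB structure can be expressed using graph-based paths and that information on specific elements or attributes can be searched easily.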
