• Title/Summary/Keyword: XML Databases

An RDBMS-based Inverted Index Technique for Path Queries Processing on XML Documents with Different Structures (상이한 구조의 XML문서들에서 경로 질의 처리를 위한 RDBMS기반 역 인덱스 기법)

  • 민경섭;김형주
    • Journal of KIISE:Databases / v.30 no.4 / pp.420-428 / 2003
  • XML is a data-oriented language for representing all types of documents, including web documents. With the advent of XML-based document generation tools, the growth of proprietary XML documents produced with those tools, and the accelerating translation of legacy data into XML, we now have a large number of differently structured XML documents. It is therefore increasingly important to retrieve the right documents from such a document set. However, previous work on XML has mainly focused on storage and retrieval methods for a single large XML document or for XML documents that share the same DTD, and research that did support structural differences did not process path queries on the document set efficiently. To resolve this problem, we propose a new inverted index mechanism using an RDBMS and show that it outperforms previous work. In particular, because it is more efficient for indirect containment relationships, we argue that the index structure is well suited to differently structured XML document sets.
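
The abstract does not describe the index layout, so the following is only a minimal sketch of the general idea, assuming a single postings table keyed by element name that stores the document ID and the root-to-node path. The table `posting` and the functions `build_index` and `docs_matching` are hypothetical names, and SQLite stands in for the RDBMS.

```python
import sqlite3
import xml.etree.ElementTree as ET

def build_index(docs):
    """Build a postings table from {doc_id: xml_text} (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posting (tag TEXT, doc_id TEXT, path TEXT)")
    conn.execute("CREATE INDEX posting_tag ON posting (tag)")

    def visit(elem, prefix, doc_id):
        # One posting per element: its tag, its document, its root-to-node path.
        path = prefix + "/" + elem.tag
        conn.execute("INSERT INTO posting VALUES (?, ?, ?)", (elem.tag, doc_id, path))
        for child in elem:
            visit(child, path, doc_id)

    for doc_id, text in docs.items():
        visit(ET.fromstring(text), "", doc_id)
    return conn

def docs_matching(conn, ancestor, descendant, direct=False):
    """Return IDs of documents that contain ancestor/descendant (direct)
    or ancestor//descendant (indirect containment)."""
    # Note: SQLite LIKE is case-insensitive for ASCII by default, and tag
    # names containing % or _ would need escaping; both are ignored here.
    pattern = f"%/{ancestor}/{descendant}" if direct else f"%/{ancestor}/%"
    rows = conn.execute(
        "SELECT DISTINCT doc_id FROM posting "
        "WHERE tag = ? AND path LIKE ? ORDER BY doc_id",
        (descendant, pattern),
    )
    return [row[0] for row in rows]
```

Because the lookup starts from the element name rather than from a per-DTD schema, the same query works on documents whose structures differ; for example, `docs_matching(conn, "author", "name")` finds a `name` nested at any depth below `author`.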

A Clustering Method Based on Path Similarities of XML Data (XML 데이타의 경로 유사성에 기반한 클러스터링 기법)

  • Choi Il-Hwan;Moon Bong-Ki;Kim Hyoung-Joo
    • Journal of KIISE:Databases / v.33 no.3 / pp.342-352 / 2006
  • Current studies on storing XML data focus either on mapping XML data efficiently to an existing RDBMS or on developing native XML storage. Some native XML stores keep each XML node as a parsed object. With this storage method, clustering, the physical arrangement of the objects, can be an important factor in performance. In this paper, we propose re-clustering techniques that store an XML document efficiently. The proposed clustering technique uses path similarities among data nodes, which can reduce page I/Os when returning query results. It can also process a path query using as few clusters as possible rather than all clusters, which makes path query processing efficient because skipping unnecessary data reduces the search space. Finally, we apply existing clustering techniques to XML storage and compare their performance with that of the proposed technique. Our results show that a proper clustering technique can improve the performance of XML storage.
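
As a rough illustration of why path-based clustering shrinks the search space, the sketch below uses a deliberate simplification: nodes with an identical root-to-node path share a cluster. The paper's actual similarity measure and re-clustering procedure are not given in the abstract, and `cluster_by_path` and `clusters_for_query` are hypothetical names.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

def cluster_by_path(xml_text):
    """Group node values by root-to-node path (simplified stand-in for
    path-similarity clustering; identical paths share one cluster)."""
    clusters = defaultdict(list)

    def visit(elem, prefix):
        path = prefix + "/" + elem.tag
        clusters[path].append((elem.text or "").strip())
        for child in elem:
            visit(child, path)

    visit(ET.fromstring(xml_text), "")
    return dict(clusters)

def clusters_for_query(clusters, path_query):
    """Select only the clusters whose path ends with the queried steps,
    so all other clusters are skipped without being read."""
    suffix = "/" + path_query.strip("/")
    return {path: nodes for path, nodes in clusters.items() if path.endswith(suffix)}
```

In a real store each cluster would correspond to a group of disk pages, so answering `book/title` would touch only the pages of the matching cluster.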

Sequence Group Validation based on Boundary Locking for Valid XML Documents (유효한 XML 문서에 대한 경계 로킹에 기반한 시퀀스 그룹 검증 기법)

  • Choi, Yoon-Sang;Park, Seog
    • Journal of KIISE:Databases / v.32 no.6 / pp.628-640 / 2005
  • XML is well accepted in several Web application areas. When many users and applications work concurrently on the same collection of XML documents, isolating the accesses and modifications of different transactions becomes an important issue. An XML document that conforms to the rules laid out in a DTD or XML schema is said to be valid, and the validity of a valid XML document should be guaranteed after the document is updated. Such validation, however, results in a lower degree of concurrency. To achieve a higher degree of concurrency and minimize the range of the XML document that must be validated, a new validation method based on a specific locking method is required. In this paper, we propose the sequence group validation method for minimizing the range of XML document validation. We also propose the boundary locking method for isolating the accesses and modifications of different transactions while preserving the validity of valid XML documents. Finally, experimental results show that the validation and locking methods increase the degree of transaction concurrency.

Storage Schemes for XML Query Cache (XML 질의 캐쉬의 저장 기법)

  • Kim, Young-Hyun;Kang, Hyun-Chul
    • Journal of KIISE:Databases / v.33 no.5 / pp.551-562 / 2006
  • XML query caching for XML database-backed Web applications has recently begun to be investigated. Despite its practical significance, the efficiency of storage schemes for cached query results has not been addressed. In this paper, we deal with storage schemes for the XML query cache. A fundamental problem in designing an efficient storage structure for the XML query cache is that there are performance tradeoffs between the two major types of operations on a cached query result: (1) retrieving the whole result to return it as the query answer and (2) updating just a small portion of it to refresh it incrementally against updates made to its source. We propose eight storage schemes for the XML query cache, categorized into three groups: (1) schemes based on a plain text file, (2) schemes based on a persistent DOM (PDOM) file, and (3) a scheme employing an RDBMS. We implemented all of them and compared their performance. We also compared our proposal with a storage scheme based on a current state-of-the-art XML storage scheme, showing that ours is more efficient.

Design and Implementation of a Translator from XQuery to SQL:2003 (XQuery SQL:2003 번역기 설계 및 구현)

  • Kim, Song-Hyon;Park, Young-Sup;Lee, Yoon-Joon
    • Journal of KIISE:Databases / v.33 no.7 / pp.668-681 / 2006
  • Due to its diverse advantages, XML has secured its position as a standard for data representation and exchange on the Internet. As a consequence, there has been much research on efficient storage and query processing of XML data. Storing XML data in a relational database system offers considerable benefits for data management and query processing: the system provides powerful query processing and data management functions, which can be extended and applied to XML data. In this paper, we design and implement a query translator that translates XQuery, a representative XML query language, into SQL:2003 queries. SQL:2003, the latest SQL standard and the successor to SQL:1999, defines SQL/XML, which supports XML. The main contributions of this paper are as follows. First, we examine the XML support features defined in the SQL:2003 standard and propose user-defined functions for the parts where it falls short. Second, we propose a way to translate XQuery into SQL that conforms to the latest SQL standard. Third, we describe in detail the design and implementation of the translator to show its feasibility.

A Circle Labeling Scheme without Re-labeling for Dynamically Updatable XML Data (동적으로 갱신가능한 XML 데이터에서 레이블 재작성하지 않는 원형 레이블링 방법)

  • Kim, Jin-Young;Park, Seog
    • Journal of KIISE:Databases / v.36 no.2 / pp.150-167 / 2009
  • XML has become the new standard for storing, exchanging, and publishing data over both the Internet and ubiquitous data stream environments. As demand for efficient handling of XML documents grows, labeling schemes have become an important topic in data storage. Recently proposed labeling schemes reflect the dynamic XML environment, which itself motivates the search for an efficient labeling scheme. However, previously proposed labeling schemes have several problems, including: 1) inserting a new node into the XML document triggers re-labeling of pre-existing nodes, and 2) they need more memory space to store the full labels. In this paper, we introduce a new labeling scheme called the Circle Labeling Scheme (CLS). In CLS, XML documents are represented in a circular form, and efficient storage of labels is supported by the concepts of Rotation Number and Parent Circle/Child Circle. The concept of Radius is applied to support the insertion of new nodes at arbitrary positions in the tree. This eliminates the need to re-label existing nodes and to increase label length, and it mitigates conflicts with existing labels. A detailed experimental study demonstrates the efficiency of CLS.

Object-Oriented Database Schemata and Query Processing for XML Data (XML 데이타를 위한 객체지향 데이터베이스 스키마 및 질의 처리)

  • Jeong, Tae-Seon;Park, Sang-Won;Han, Sang-Yeong;Kim, Hyeong-Ju
    • Journal of KIISE:Databases / v.29 no.2 / pp.89-98 / 2002
  • As XML has become an emerging standard for information exchange on the World Wide Web, it has gained attention in the database community as a way to extract information from XML viewed as a database model. Recently, many researchers have addressed the problem of storing XML data and processing XML queries using traditional database engines, and most of them have used relational database systems. In this paper, we show that OODBSs can be another solution. Our technique generates an OODB schema from DTDs and processes XML queries. In particular, we show that the semi-structured part of XML data can be represented by inheritance and that this can be used to improve query processing.

An Ontology-based Knowledge Management System - Integrated System of Web Information Extraction and Structuring Knowledge -

  • Mima, Hideki;Matsushima, Katsumori
    • Proceedings of the CALSEC Conference / 2005.03a / pp.55-61 / 2005
  • We introduce a new web-based knowledge management system, currently under development, that combines XML-based web information extraction with our knowledge structuring technologies using ontology-based natural language processing. Our aim is to provide efficient access to heterogeneous information on the web, enabling users to draw effortlessly on a wide range of textual and non-textual resources, such as newspapers and databases, and thereby accelerate knowledge acquisition from those sources. To achieve efficient knowledge management, we first propose an XML-based Web information extraction method that includes a sophisticated control language for extracting data from Web pages. By using standard XML technologies, our approach makes information extraction easy through a) detaching rules from processing, b) restricting the target of processing, and c) interactive operations for developing extraction rules. We then propose a knowledge structuring system that includes 1) automatic term recognition, 2) domain-oriented automatic term clustering, 3) similarity-based document retrieval, 4) real-time document clustering, and 5) visualization. The system supports integrating different types of databases (textual and non-textual) and retrieving different types of information simultaneously. By further explaining the specification and implementation techniques of the system, we demonstrate how it can accelerate knowledge acquisition on the Web even for novice users of the field.

Shredding XML Documents into Relations using Structural Redundancy (구조적 중복을 사용한 XML 문서의 릴레이션으로의 분할저장)

  • Kim Jaehoon;Park Seog
    • Journal of KIISE:Databases / v.32 no.2 / pp.177-192 / 2005
  • In this paper, we introduce a structural redundancy method. It reduces the query processing cost incurred when an XML document is reconstructed from the divided XML data produced by shredding XML documents into relations. The fundamental idea is that query performance can be enhanced by analyzing query patterns and replicating the data essential to query performance. For practical and effective structural redundancy, we analyzed three types of replication: ID, VALUE, and SUBTREE. In addition, if the given XML data and queries are very large and complex, it can be very difficult to find the optimal redundancy set, so a heuristic search method is introduced in this paper. Finally, the XML query processing cost that arises from employing structural redundancy and the efficiency of the proposed search method are analyzed experimentally. XML read queries clearly run more quickly, while XML update queries run more slowly because of the additional cost of keeping replicas consistent. However, the experimental results showed that in-place ID replication is useful even when the update cost is excessive. It was also observed that multiple-place SUBTREE replication can enhance read query performance remarkably, provided that the update cost is not excessive.

A Flexible Query Processing System for XML Regular Path Expressions (XML 정규 경로식을 위한 유연한 질의 처리 시스템)

  • 김대일;김기창;김유성
    • Journal of KIISE:Databases / v.30 no.6 / pp.641-650 / 2003
  • The eXtensible Markup Language (XML) is emerging as a standard format for data representation and exchange on the Internet. There has been research on storing and retrieving XML documents using relational databases, which offer mature techniques for large-scale data processing, recovery, concurrency control, and so on. Because previous systems use the same structure information and the same fundamental operations to process all kinds of XML queries, they can process only some specific queries efficiently, not all types of query. In this paper, we propose a flexible query processing system. To process queries efficiently, the proposed system analyzes regular path expression queries and uses a $\theta$-join on region numbering values to check ancestor-descendant relationships and an equi-join on the parent's region start value to check parent-child relationships. The proposed system thus processes XML regular path expressions efficiently. Our experimental results show that the proposed XML query processing system is more efficient than previous systems.
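
The two join strategies named in the abstract can be sketched as follows. This is a minimal illustration under assumed details, not the paper's system: the table `node`, its columns, and the function names are hypothetical, and SQLite stands in for the relational database. Each element gets a region `(region_start, region_end)` from a depth-first walk, plus its parent's start value.

```python
import sqlite3
import xml.etree.ElementTree as ET

def number_regions(xml_text):
    """Return (tag, region_start, region_end, parent_start) for every element."""
    rows = []
    counter = 0

    def visit(elem, parent_start):
        nonlocal counter
        counter += 1                      # number assigned on entering the element
        start = counter
        for child in elem:
            visit(child, start)
        counter += 1                      # number assigned on leaving the element
        rows.append((elem.tag, start, counter, parent_start))

    visit(ET.fromstring(xml_text), None)
    return rows

def load(xml_text):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE node (tag TEXT, region_start INTEGER, "
                 "region_end INTEGER, parent_start INTEGER)")
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", number_regions(xml_text))
    return conn

def descendants(conn, anc_tag, desc_tag):
    """anc//desc: theta-join testing that one region lies inside the other."""
    sql = """SELECT a.region_start, d.region_start FROM node a JOIN node d
             ON a.region_start < d.region_start AND d.region_end < a.region_end
             WHERE a.tag = ? AND d.tag = ?
             ORDER BY a.region_start, d.region_start"""
    return conn.execute(sql, (anc_tag, desc_tag)).fetchall()

def children(conn, parent_tag, child_tag):
    """parent/child: equi-join on the parent's region start value."""
    sql = """SELECT p.region_start, c.region_start FROM node p JOIN node c
             ON c.parent_start = p.region_start
             WHERE p.tag = ? AND c.tag = ?
             ORDER BY p.region_start, c.region_start"""
    return conn.execute(sql, (parent_tag, child_tag)).fetchall()
```

The design point is that a parent-child step does not need the range comparison at all: an equality join on `parent_start` can use an ordinary index, while the inequality join is reserved for `//` steps, where nesting depth is unknown.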