• Title/Summary/Keyword: XML Databases

An Efficient Sequence Matching Method for XML Query Processing (XML 질의 처리를 위한 효율적인 시퀀스 매칭 기법)

  • Seo, Dong-Min;Song, Seok-Il;Yoo, Jae-Soo
    • Journal of KIISE:Databases / v.35 no.4 / pp.356-367 / 2008
  • As XML is gaining unqualified success as a universal data representation and exchange format, particularly on the World Wide Web, the problem of querying XML documents poses interesting challenges to database researchers. Over the past years, several structural XML query processing methods, including XISS and XR-tree, have been proposed for fast query processing. However, structural XML query processing incurs expensive join costs for twig path queries. Recently, sequence matching based XML query processing methods, including ViST and PRIX, have been proposed to solve this problem. These methods match structured queries against structured data as a whole, without breaking the queries down into sub-queries of paths or nodes and without relying on join operations to combine their results. However, ViST determines structural relationships incorrectly because its numbering scheme is not optimized, and PRIX requires much processing time to match the LPS and NPS of XML data trees and queries. Therefore, in this paper, we propose an efficient sequence matching method that uses bottom-up query processing. To verify the superiority of our index structure, we compare our sequence matching method with ViST and PRIX in terms of query processing for linear path and twig path queries that include wildcards ('*' and '//').

Bitmap Indexes and Query Processing Strategies for Relational XML Twig Queries (관계형 XML 가지 패턴 질의를 위한 비트맵 인덱스와 질의 처리 기법)

  • Lee, Kyong-Ha;Moon, Bong-Ki;Lee, Kyu-Chul
    • Journal of KIISE:Databases / v.37 no.3 / pp.146-164 / 2010
  • Due to the increasing volume of XML data, it is considered prudent to store XML data in an industry-strength database system instead of relying on a domain-specific application or a file system. For shredded XML data stored in relational tables, however, it may not be straightforward to apply existing algorithms for twig query processing, since most of these algorithms require XML data to be accessed in the form of streams of elements grouped by their tags and sorted in a particular order. To support XML query processing within the common framework of relational database systems, we first propose several bitmap indexes and strategies for using them to support holistic twig joins on XML data stored in relational tables. Since bitmap indexes are well supported in most commercial and open-source database systems, the proposed bitmap indexes and twig query processing strategies can be incorporated into a relational query processing framework with more ease. The proposed query processing strategies are efficient in terms of both time and space, because the compressed bitmap indexes stay compressed during data access. In addition, we propose a hybrid index that computes twig query solutions with only bit-vectors, without accessing the labeled XML elements stored in the relational tables.

IntoPub: A Directory Server for Bioinformatics Tools and Databases

  • Jung, Dong-Soo;Kim, Ji-Han;Lee, Sang-Hyuk;Lee, Byung-Wook
    • Interdisciplinary Bio Central / v.3 no.3 / pp.12.1-12.3 / 2011
  • Bioinformatics tools and databases are useful for understanding and processing various biological data. Numerous resources are published each year, and finding up-to-date, relevant tools and databases is not a trivial task. Moreover, no server is available that provides comprehensive coverage of bioinformatics resources in all biological fields. Here, we present a directory server called IntoPub that provides information on web resources. First, we downloaded XML-formatted abstracts containing web URLs from the NCBI PubMed database by using the 'ESearch-EFetch' function of the NCBI E-utilities. The URLs are identified in the PubMed abstracts by their 'www' or 'http' prefixes. Then, we curate the downloaded abstracts both automatically and manually. As of July 2011, the IntoPub database has 12,118 abstracts containing web URLs from 174 journals. Our analysis shows that the number of abstracts containing web resources has increased significantly every year. Since January 2010, the server has been tested by many biologists from several countries to gather opinions on user satisfaction, usefulness, practicability, and ease of use. In the IntoPub web server, users can find relevant bioinformatics resources more easily than by searching PubMed. IntoPub will continue to update and incorporate new web resources from PubMed and other literature databases. IntoPub, available at http://into.kobic.re.kr/, is updated every day.
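The prefix-based URL extraction described in this abstract can be illustrated with a minimal Python sketch. The regular expression, the function name `extract_urls`, and the trailing-punctuation cleanup are assumptions made for illustration; the abstract does not specify how IntoPub implements this step.

```python
import re

# Assumed pattern: a token beginning with "http://", "https://", or "www."
# and running until whitespace, an angle bracket, or a parenthesis.
URL_PATTERN = re.compile(r"\b(?:https?://|www\.)[^\s<>()]+", re.IGNORECASE)


def extract_urls(abstract_text):
    """Return the distinct URLs found in an abstract, in order of appearance."""
    urls = []
    for match in URL_PATTERN.finditer(abstract_text):
        # Drop sentence punctuation that was glued to the end of the URL.
        url = match.group(0).rstrip(".,;:")
        if url not in urls:
            urls.append(url)
    return urls


if __name__ == "__main__":
    text = "IntoPub, available at http://into.kobic.re.kr/, is updated every day."
    print(extract_urls(text))  # ['http://into.kobic.re.kr/']
```

A pattern this simple would still need the manual curation the abstract mentions, since it cannot tell a live resource from a stale or malformed link.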

An Automatic Schema Generation System based on the Contents for Integrating Web Information Sources (웹 정보원 통합을 위한 내용 기반의 스키마 자동생성시스템)

  • Kwak, Jun-Young;Bae, Jong-Min
    • Journal of the Korea Society of Computer and Information / v.13 no.6 / pp.77-86 / 2008
  • To users, Web information sources can be regarded as the largest distributed database. By virtually integrating the distributed information sources and regarding them as a single huge database, we can query that database to extract information. This capability is important for developing Web application programs. To integrate the sources, we have to infer a database schema from browsing-oriented Web documents. This paper presents a heuristic algorithm that infers an XML Schema fully automatically from semi-structured Web documents. The algorithm first extracts candidate pattern regions based on predefined structure-making tags, then determines a target pattern region using a few heuristic factors, and finally derives XML Schema extraction rules from the target pattern region. The schema extraction rules are represented in XQuery, which makes it possible to develop various application systems using open standard XML tools. We also present experimental results for several public web sources to show the effectiveness of the algorithm.

FiST: XML Document Filtering by Sequencing Twig Patterns (가지형 패턴의 시퀀스화를 이용한 XML 문서 필터링)

  • Kwon Joon-Ho;Rao Praveen;Moon Bong-Ki;Lee Suk-Ho
    • Journal of KIISE:Databases / v.33 no.4 / pp.423-436 / 2006
  • In recent years, publish-subscribe (pub-sub) systems based on XML document filtering have received much attention. In a typical pub-sub system, subscribing users specify their interests in profiles expressed in the XPath language, and each new piece of content is matched against the user profiles so that it is delivered only to the interested subscribers. As the number of subscribed users and their profiles can grow very large, the scalability of the system is critical to the success of pub-sub services. In this paper, we propose a novel scalable filtering system called FiST (Filtering by Sequencing Twigs) that transforms twig patterns expressed in XPath and XML documents into sequences using Prüfer's method. As a consequence, instead of matching the linear paths of twig patterns individually and merging the matches during post-processing, FiST performs holistic matching of twig patterns against incoming documents. FiST organizes the sequences into a dynamic hash-based index for efficient filtering. We demonstrate that our holistic matching approach yields lower filtering cost and good scalability under various situations.
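To make the idea of sequencing a tree with Prüfer's method concrete, here is a minimal Python sketch. It numbers the nodes in postorder, then repeatedly deletes the smallest-numbered leaf and records its parent, which is the general Prüfer construction. The tree representation (`(label, [children])` tuples) and the function names are assumptions for illustration, and the sketch is not a reproduction of FiST's actual sequencing or indexing.

```python
import heapq


def postorder_number(tree):
    """Number nodes of a (label, [children]) tree in postorder.

    Returns (labels, parent): labels maps node number -> label, and
    parent maps node number -> parent's number (the root has no entry).
    """
    labels, parent = {}, {}
    counter = 0

    def visit(node):
        nonlocal counter
        label, children = node
        child_ids = [visit(child) for child in children]
        counter += 1
        labels[counter] = label
        for child_id in child_ids:
            parent[child_id] = counter
        return counter

    visit(tree)
    return labels, parent


def prufer_sequences(tree):
    """Return (label_sequence, number_sequence) for a rooted, labeled tree."""
    labels, parent = postorder_number(tree)
    child_count = {node: 0 for node in labels}
    for par in parent.values():
        child_count[par] += 1

    leaves = [node for node, count in child_count.items() if count == 0]
    heapq.heapify(leaves)

    label_seq, number_seq = [], []
    # Delete n - 1 nodes so that only the root remains.
    for _ in range(len(labels) - 1):
        leaf = heapq.heappop(leaves)  # smallest-numbered leaf
        par = parent[leaf]
        label_seq.append(labels[par])
        number_seq.append(par)
        child_count[par] -= 1
        if child_count[par] == 0:  # the parent has become a leaf
            heapq.heappush(leaves, par)
    return label_seq, number_seq


if __name__ == "__main__":
    # a has children b and d; b has child c.
    doc = ("a", [("b", [("c", [])]), ("d", [])])
    print(prufer_sequences(doc))  # (['b', 'a', 'a'], [2, 4, 4])
```

Once both a twig pattern and a document are reduced to sequences like these, matching becomes a sequence comparison problem rather than a join over separately matched paths.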

An XML Database System for 3-Dimensional Graphic Images (3차원 그래픽 이미지를 위한 XML 데이타베이스 시스템)

  • Hwang, Jong-Ha;Hwang, Su-Chan
    • Journal of KIISE:Databases / v.29 no.2 / pp.110-118 / 2002
  • This paper presents a 3-D graphic database system based on XML that supports content-based retrieval of 3-D images. Most graphics application systems are currently centered on the processing of 2-D images, and research on 3-D graphics is mainly concerned with the visualization aspects of 3-D images. They do not support the semantic modeling of 3-D objects and their spatial relations. In our data model, 3-D images are represented as compositions of 3-D graphic objects with associated spatial relations. Complex 3-D objects are modeled using a set of primitive 3-D objects rather than the lines and polygons found in traditional graphic systems. This model supports content-based retrieval of scenes that contain a particular object or that satisfy certain spatial relations among the objects they contain. 3-D images are stored in the database as XML documents using the 3DGML DTD, which was developed for modeling 3-D graphic data. Finally, this paper describes some example queries executed in our Web-based prototype database system.

A Study of the Integrated Operation for Databases with Different Data Structures (상이한 데이터 구조의 데이터베이스간 통합 운영방안 연구 - 기초학문자료센터를 중심으로 -)

  • Ko, Young-Man;Bae, Kyung-Jae
    • Journal of the Korean Society for Library and Information Science / v.45 no.3 / pp.69-85 / 2011
  • This study reviewed theories of database integration, which combines heterogeneous data structures, and, as a case study, suggested a practical method for integrating the databases of Korean Research Memory (KRM) and Infrastructural Basic Research (IBR). To distribute the outcomes of IBR broadly, its database must be connected to and integrated with the KRM database. As a solution, it was suggested that the current IBR database should follow standard guidelines as an XML database, and that its future database should either be integrated with the KRM database or be established as a stand-alone system.

PrimeFilter: An Efficient XML Data Filtering based on Prime Number Indexing (PrimeFilter: 소수 인덱싱 기법에 기반한 효율적 XML 데이타 필터링)

  • Kim, Jae-Hoon;Kim, Sang-Wook;Park, Seog
    • Journal of KIISE:Databases / v.35 no.5 / pp.421-431 / 2008
  • XML has recently become a de facto standard for online data exchange between heterogeneous systems, and research on filtering streaming XML data has come into the spotlight. Because streaming XML data filtering requires rapid matching of queries against XML data, query processing must be performed efficiently. Until now, most research has focused only on partial sharing of path expressions or on efficient predicate processing, aiming at time and space efficiency. However, if the containment relationships between queries are computed in advance and the lowest-level query matches the XML data, we can conclude that the higher-level queries also match the XML data without any further processing. That is, this containment technique can be another effective solution for streaming XML data filtering. In this paper, we propose an efficient XML data filtering method based on prime number indexing and the containment relationships between queries. Experimental results show that the proposed method outperforms the existing method: in all experiments, each of which has its own distinct test purpose, our method performed more than two times better.
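The abstract does not describe the actual encoding used by PrimeFilter, so the following Python sketch shows only one hypothetical way that prime numbers can support a cheap containment pre-check. Each distinct element name is assigned a prime, a query is encoded as the product of the primes of its names, and divisibility then tests whether one query's name set is included in another's. The class `PrimeCoder`, the function `names_subset`, and the encoding itself are assumptions for illustration. Name-set inclusion is only a necessary condition for real XPath containment, not a sufficient one.

```python
from itertools import count


def prime_stream():
    """Yield 2, 3, 5, 7, ... by trial division (fine for small vocabularies)."""
    found = []
    for candidate in count(2):
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate


class PrimeCoder:
    """Hypothetical encoder: one prime per element name, one product per query."""

    def __init__(self):
        self._primes = prime_stream()
        self._prime_of = {}

    def code(self, names):
        product = 1
        # Sort so that prime assignment is deterministic across runs.
        for name in sorted(set(names)):
            if name not in self._prime_of:
                self._prime_of[name] = next(self._primes)
            product *= self._prime_of[name]
        return product


def names_subset(code_a, code_b):
    """True if every name encoded in code_a is also encoded in code_b."""
    return code_b % code_a == 0


if __name__ == "__main__":
    coder = PrimeCoder()
    general = coder.code(["book", "title"])             # e.g. //book/title
    specific = coder.code(["book", "title", "author"])  # e.g. //book[author]/title
    print(names_subset(general, specific))  # True
    print(names_subset(specific, general))  # False
```

Under this assumed scheme, a filter could skip the full containment test for any pair of queries that fails the divisibility check.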

A Database Schema Integration Method Using XML Schema (XML Schema를 이용한 이질의 데이터베이스 스키마 통합)

  • 박우창
    • Journal of Internet Computing and Services / v.3 no.2 / pp.39-56 / 2002
  • In distributed computing environments, many database applications, such as data warehousing and data mining, must share data with each other while preserving the autonomy of local databases. The first step toward such applications is the integration of heterogeneous database schemas, but there is no commonly accepted data model for the integration, and constructing an integration program is difficult. In this paper, we use XML Schema to represent the common data model and exploit XSLT to reduce the programming difficulties. We define schema integration operations and develop a methodology for semi-automatic schema integration according to the types of schema conflicts. Compared with existing methodologies, our integration method offers benefits in standardization and in the extensibility of the schema integration process.

Design and Implementation of a Multimedia Retrieval System (멀티미디어 검색 시스템의 설계 및 구현)

  • 노승민;황인준
    • Journal of KIISE:Databases / v.30 no.5 / pp.494-506 / 2003
  • Recently, the explosive popularity of multimedia information has triggered the need to efficiently retrieve multimedia content, including audio, video, and images, from databases. In this paper, we propose an XML-based retrieval scheme and a data model that complement the weak aspects of annotation-based and content-based retrieval methods. The property and hierarchy structure of image and video data are represented and manipulated based on the Multimedia Description Schema (MDS), which conforms to the MPEG-7 standard. For audio content, pitch contours extracted from acoustic features are converted into UDR strings. In particular, to improve retrieval performance, users' access patterns and frequencies are used in constructing the index. We have implemented a prototype system and evaluated its performance through various experiments.