• Title/Summary/Keyword: XML Tree

A Tree-structured XPath Query Reduction Scheme for Enhancing XML Query Processing Performance (XML 질의의 수행성능 향상을 위한 트리 구조 XPath 질의의 축약 기법에 관한 연구)

  • Lee, Min-Soo;Kim, Yun-Mi;Song, Soo-Kyung
    • The KIPS Transactions:PartD / v.14D no.6 / pp.585-596 / 2007
  • XML data generally has a hierarchical tree structure, which is reflected in the mechanisms used to store and retrieve it. When XML data is stored in a database, the hierarchical relationships among the XML elements are therefore taken into account as the data is restructured and stored. To support user search queries, a mechanism is also needed to compute the hierarchical relationships between the element structures specified in a query. The structural join operation is one solution to this problem: it efficiently computes hierarchical relationships in a database based on a node numbering scheme. However, processing a tree-structured XML query that contains complex nested hierarchical relationships still requires multiple structural joins, which leads to a high query execution cost. In this paper, we therefore provide a preprocessing mechanism that reduces the cost of multiple nested structural joins by applying the concept of equivalence classes, and we suggest a query path reduction algorithm that shortens path queries consisting of regular expressions. The mechanism is devised especially to reduce path queries containing branch nodes. The experimental results show that the proposed algorithm can reduce the time required to process the path queries to 1/3 of the original execution time.
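
The structural join mentioned in the abstract above can be illustrated with a small sketch. It assumes a region-based numbering scheme in which every element gets a (start, end) pair during a depth-first walk, so that an ancestor's region strictly contains the regions of its descendants. The function names `number_nodes` and `structural_join` are hypothetical, and the join is a naive nested loop for clarity; it is not the paper's algorithm and does not include its query reduction step.

```python
import xml.etree.ElementTree as ET


def number_nodes(root):
    """Assign a (start, end) region to every element via a depth-first walk."""
    regions = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        start = counter
        for child in node:
            visit(child)
        counter += 1
        regions[node] = (start, counter)

    visit(root)
    return regions


def structural_join(ancestors, descendants):
    """Return (a, d) pairs where region a strictly contains region d.

    Naive nested-loop version, chosen for readability only.
    """
    return [
        (a, d)
        for a in ancestors
        for d in descendants
        if a[0] < d[0] and d[1] < a[1]
    ]


# Hypothetical sample document: which <c> elements lie below a <b> element?
doc = ET.fromstring("<a><b><c/></b><c/><d><b><c/></b></d></a>")
regions = number_nodes(doc)
b_regions = [regions[n] for n in doc.iter("b")]
c_regions = [regions[n] for n in doc.iter("c")]
pairs = structural_join(b_regions, c_regions)
```

Each extra step in a nested path query adds another join of this kind, which is the cost the abstract says the proposed preprocessing aims to reduce.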

XML Clustering Technique by Genetic Algorithm (유전자 알고리즘을 통한 XML 군집화 방법)

  • Kim, Woo-Saeng
    • Journal of the Institute of Electronics Engineers of Korea CI / v.49 no.3 / pp.1-7 / 2012
  • Recently, research has focused on developing efficient techniques for accessing, querying, and managing XML documents, which are frequently used on the Internet. In this paper, we propose a new method for clustering XML documents efficiently. An element of an XML document corresponds to a node of the corresponding tree, and an inclusion relationship in the document corresponds to a parent-child relationship between nodes of the tree. Therefore, similar XML documents have similar node names and levels in their corresponding trees. We build an evaluation function on this characteristic and use it to cluster XML documents with a genetic algorithm. The experiment shows that the proposed method performs better than existing methods.
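
The idea that similar documents share node names at the same tree levels can be sketched as a pairwise similarity measure. The Jaccard coefficient used here is an assumption made for illustration; the abstract does not state the paper's actual evaluation function, and the genetic algorithm that would search for a clustering with such a function is omitted. `name_level_pairs` and `similarity` are hypothetical names.

```python
import xml.etree.ElementTree as ET


def name_level_pairs(xml_text):
    """Collect the set of (element name, depth) pairs found in a document."""
    pairs = set()

    def visit(node, depth):
        pairs.add((node.tag, depth))
        for child in node:
            visit(child, depth + 1)

    visit(ET.fromstring(xml_text), 0)
    return pairs


def similarity(xml_a, xml_b):
    """Jaccard similarity of the two documents' (name, depth) sets (assumed measure)."""
    a, b = name_level_pairs(xml_a), name_level_pairs(xml_b)
    return len(a & b) / len(a | b)


# Two hypothetical documents sharing <book> at depth 0 and <title> at depth 1.
score = similarity(
    "<book><title/><author/></book>",
    "<book><title/><year/></book>",
)
```

A clustering fitness function could, for example, reward high similarity within clusters and low similarity between them.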

An XML Query Optimization Technique by Signature based Block Traversing (시그니처 기반 블록 탐색을 통한 XML 질의 최적화 기법)

  • Park, Sang-Won;Park, Dong-Ju;Jeong, Tae-Seon;Kim, Hyeong-Ju
    • Journal of KIISE:Databases / v.29 no.1 / pp.79-88 / 2002
  • Data on the Internet are usually represented and transferred as XML. XML data is represented as a tree, and object repositories are therefore well suited to storing and querying it because of their modeling power. XML queries are represented as regular path expressions and evaluated by traversing each object of the tree in the object repository. Several indexes have been proposed to evaluate regular path expressions quickly. However, in some cases they may not cover all possible paths, because doing so would require a great amount of disk space. To evaluate queries efficiently in such cases, we propose an optimized traversal that combines the signature method and block traversing. The signature approach shrinks the search space by using the signature information attached to each object, which hints at the existence of a certain label in the sub-tree. Block traversing reduces disk I/O by evaluating the reachable objects in a page early. We conducted diverse experiments to show that the hybrid approach achieves better performance than the naive ones.
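
The signature idea in the abstract above can be sketched as follows: each node stores a bit mask that is the OR of the label bits in its sub-tree, and a traversal skips any sub-tree whose mask lacks the bit of the label being sought. This is a minimal illustration under stated assumptions, not the paper's method: the `Node` class, `label_bits`, and `find` are hypothetical, bits are assigned in order of first use rather than by a hash function, and block traversing (the disk-page aspect) is not modeled.

```python
_BIT_OF = {}  # label -> bit position, assigned on first use (illustrative only)


def label_bits(label, width=64):
    """Return a one-bit mask for a label; positions wrap around after `width` labels."""
    index = _BIT_OF.setdefault(label, len(_BIT_OF))
    return 1 << (index % width)


class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        # Signature: OR of the label bits of this node and of its whole sub-tree.
        self.signature = label_bits(label)
        for child in self.children:
            self.signature |= child.signature


def find(node, label, visited):
    """Collect nodes carrying `label`, skipping sub-trees whose signature rules it out."""
    visited.append(node.label)
    hits = [node] if node.label == label else []
    for child in node.children:
        if child.signature & label_bits(label):
            hits.extend(find(child, label, visited))
    return hits


# Hypothetical tree: a -> (b -> c), (d -> e). Searching for "c" never enters "d".
tree = Node("a", [Node("b", [Node("c")]), Node("d", [Node("e")])])
visited = []
hits = find(tree, "c", visited)
```

With a real hash-based signature, two labels may share a bit, so a set bit only hints that the label may be present; a cleared bit still guarantees that the sub-tree can be skipped.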

Program Plagiarism Detection based on X-treeDiff+ (X-treeDiff+ 기반의 프로그램 복제 탐지)

  • Lee, Suk-Kyoon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.4 / pp.44-53 / 2010
  • Program plagiarism is a significant factor in lowering the quality of education in computer programming. In this paper, we propose a technique for identifying similar or identical programs in order to prevent students from recklessly copying their programming assignments. Existing approaches for identifying similar programs are mainly based on fingerprints or pattern matching for text documents. Unlike those approaches, ours is based on program structure. Using parsing programs, we first transform programs into XML documents by representing the syntactic components of the programs as elements of an XML document; we then run X-tree Diff+, a change detection algorithm for XML documents, to produce an edit script describing the changes. Whether programs are similar or identical is decided by analyzing the edit scripts with respect to program plagiarism. This analysis lets users understand the process of converting one program into the other, so that they can make a qualitative judgement that considers the characteristics of the programming assignment and the degree of plagiarism.
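
The first step described above, turning a program's syntactic components into XML elements, can be sketched with Python's standard `ast` module. This is an assumption made for illustration: the abstract does not say which programming language or parser the paper uses, and X-tree Diff+ itself is not reproduced here. The helper names `to_xml` and `program_to_xml` are hypothetical.

```python
import ast
import xml.etree.ElementTree as ET


def to_xml(node):
    """Turn an AST node into an XML element named after the node's type."""
    element = ET.Element(type(node).__name__)
    for child in ast.iter_child_nodes(node):
        element.append(to_xml(child))
    return element


def program_to_xml(source):
    """Serialize the syntactic structure of a Python program as an XML string."""
    return ET.tostring(to_xml(ast.parse(source)), encoding="unicode")


# Identifiers and literal values are dropped, so renaming a variable leaves
# the XML unchanged, while changing an operator alters it.
original = program_to_xml("total = price + tax")
renamed = program_to_xml("t = p + x")
altered = program_to_xml("total = price * tax")
```

Two such XML documents could then be passed to a tree differencing tool, whose edit script would show which structural changes separate the programs.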

A Unification Algorithm for DTDs of XML Documents having a Similar Structure (유사 구조를 가지는 XML 문서들의 DTD 통합 알고리즘)

  • 유춘식;우선미;김용성
    • Journal of KIISE:Software and Applications / v.31 no.10 / pp.1400-1411 / 2004
  • In many cases, XML documents have different DTDs even though they have a similar structure and are logically the same kind of document. As a result, these XML documents have different database schemas and are stored in different databases. In this paper, we therefore propose an algorithm that unifies the DTDs of such XML documents using finite automata and the tree structure. Finite automata are suitable for representing the repetition operators and connectors of a DTD and provide a simple representation of a DTD. Using finite automata reduces the complexity of the algorithm. We apply the proposed algorithm to unify the DTDs of science journals.

Structure Based Information Retrieval Algorithm Using XML Technology and String Matching Algorithm (XML 기술과 스트링 매칭 기법을 이용한 구조 기반 정보 검색 알고리즘)

  • Han, Gi-Deok;Kwon, Hyuk-Chul
    • Proceedings of the Korean Information Science Society Conference / 2007.06c / pp.171-176 / 2007
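  • A parse tree, the result of parsing, carries structural information about a sentence; we propose an algorithm that uses this information for information retrieval. The proposed algorithm uses XML technology and a string matching technique, namely approximate string matching. The parse trees obtained by parsing the query and the documents are converted into XML form, and approximate string matching is then applied to the two to compute the similarity between the query and each document. The advantages of the proposed algorithm are that it supports structure-based retrieval, retrieval of similar information, and retrieval of similar structures.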
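
The comparison step in the abstract above can be sketched by serializing two parse trees as XML, splitting them into tag and word tokens, and scoring the two token sequences. The sketch uses `difflib.SequenceMatcher` from the Python standard library as a stand-in; the abstract does not say which approximate string matching algorithm the paper uses, and the sample parse trees are invented for illustration.

```python
import difflib
import re


def tokens(xml_text):
    """Split serialized XML into tag tokens and word tokens."""
    return re.findall(r"</?\w+>|[^<>\s]+", xml_text)


def tree_similarity(xml_a, xml_b):
    """Similarity in [0, 1] between two XML-encoded parse trees (stand-in measure)."""
    return difflib.SequenceMatcher(None, tokens(xml_a), tokens(xml_b)).ratio()


# Hypothetical parse trees for a query and two documents.
query = "<S><NP>dogs</NP><VP>bark</VP></S>"
doc_same_structure = "<S><NP>cats</NP><VP>bark</VP></S>"
doc_other_structure = "<S><VP>run</VP></S>"

close = tree_similarity(query, doc_same_structure)
far = tree_similarity(query, doc_other_structure)
```

Because the tags are part of the token sequence, documents with the same structure but different words still score high, which matches the abstract's claim that similar structures can be retrieved.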

A Prototype Implementation of XML Document Version Management System Using X-treeDiff (X-treeDiff를 이용한 XML 문서의 버전 관리 시스템 프로트타입 개발)

  • Kim, Sung-Joon;Kim, Dong-Ah;Lee, Suk-Kyoon
    • Proceedings of the Korea Information Processing Society Conference / 2003.11c / pp.1343-1346 / 2003
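  • Many information systems today provide a variety of electronic documents over the web. In this environment, applications that manage continuously updated documents need efficient techniques for managing them. In this paper, we propose a version management system for XML documents based on the edit scripts computed by the recently proposed X-treeDiff, and we present a prototype implementation. Unlike most text-based systems such as CVS, the proposed version management system is designed for documents with a tree data structure and is therefore effective for managing tree-structured documents such as XML.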

Data Transformation through Mapping between XML and Relation Database (XML과 관계형 데이타베이스 매핑을 통한 자료의 변환)

  • Kim Gil-Choon
    • Journal of the Korea Society of Computer and Information / v.9 no.4 s.32 / pp.5-12 / 2004
  • Data transformation between XML and a relational database is carried out through the principle of mapping between them. There are two ways to access SQL Server: one is to assign an SQL query to a URL, and the other is to use a template file. MS SQL Server takes advantage of the OpenXML function to transform the results of executing an SQL query into XML documents. That is, OpenXML first builds a node tree and then transforms the row-set data of XML documents into XML data of relational type. To insert XML data into the database, data is extracted by parsing the XML documents with the sp_xml_preparedocument procedure, and the document structure is then mapped into a tree structure and stored in a database table. Consequently, data transformation between XML and a relational database is made through the mapping between them. This article proposes the principle of mapping between XML and a relational database and then shows an implementation of the transformation between them, introducing the possibility of extending the data, improving efficiency, and achieving various other effects.

A Suffix Tree Approach for Efficient XML Path Indexing (접미어 트리 구조를 이용한 효율적인 XML 경로 인덱싱)

  • 이덕형;원정임;노관준;윤지희
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.88-90 / 2002
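  • As XML documents have rapidly become common on the Internet, a variety of XML query languages have been proposed for information retrieval. A common feature of XML queries is easy retrieval of structural information through regular path expressions that use characters such as '*'. In this paper, we propose a new path indexing technique based on a suffix tree. The proposed technique encodes each path in an XML document as a unique, abbreviated string and stores all suffixes of each encoded string in the index. The technique processes structural queries containing general regular path expressions very efficiently, and it can also process queries effectively even when the path information is specified imprecisely.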
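
Storing every suffix of every path, as the abstract above describes, lets a query that names only the tail of a path be answered with a single lookup. The sketch below keeps the suffixes in a plain dictionary instead of a true suffix tree and skips the paper's path encoding step, so it only illustrates the lookup behaviour; `build_suffix_index`, `lookup`, and the sample paths are hypothetical.

```python
from collections import defaultdict


def build_suffix_index(paths):
    """Map every suffix of every label path to the full paths that end with it."""
    index = defaultdict(set)
    for path in paths:
        labels = tuple(path.strip("/").split("/"))
        for i in range(len(labels)):
            index[labels[i:]].add(path)
    return index


def lookup(index, suffix):
    """Answer a query such as //author/name: all paths ending in the given labels."""
    key = tuple(suffix.strip("/").split("/"))
    return sorted(index.get(key, set()))


# Hypothetical label paths taken from some XML documents.
paths = [
    "/lib/book/author/name",
    "/lib/article/author/name",
    "/lib/book/title",
]
index = build_suffix_index(paths)
matches = lookup(index, "author/name")
```

A suffix tree shares common suffixes among paths, whereas this dictionary stores each suffix as a separate key, so the sketch trades space for simplicity.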

Design of Formalized message exchanging method using XMDR (XMDR을 이용한 정형화된 메시지 교환 기법 설계)

  • Hwang, Chi-Gon;Jung, Kye-Dong;Choi, Young-Keun
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.6 / pp.1087-1094 / 2008
  • Recently, XML has been widely used as a standard for data exchange, and XML documents have tended to grow larger. Data transfer can cause problems because of the increase in traffic, especially when massive data such as a data warehouse is being collected and analyzed. An XMDR wrapper can solve this problem: it analyzes the tree structures of an XML Schema, regenerates the XML Schema from the analyzed tree structures, and sends it to each station together with an XMDR Query. The XML documents returned as a result encode XML tags according to the XML Schema and are sent as standardized messages. Because the formalized XML documents decrease network traffic and contain XML class information, they are efficient for extraction, conversion, and alignment of data. Because they have standardized forms, they are also efficient for conversion through XSLT. In this paper, we propose a method in which the XML Schema and XMDR_Query sent to each station are generated through XMDR (extended Meta-Data Registry), and the generation of products and XML conversion occur in each station wrapper.