• Title/Summary/Keyword: Regular Path Expression

Processing of Multiple Regular Path Expressions using PID (경로 식별자를 이용한 다중 정규경로 처리기법)

  • Kim, Jong-Ik;Jeong, Tae-Seon;Kim, Hyeong-Ju
    • Journal of KIISE:Databases / v.29 no.4 / pp.274-284 / 2002
  • Queries on XML are based on paths in the data graph, which is modeled as an edge-labeled graph. All proposed XML query languages express queries using regular expressions that traverse arbitrary paths in the data graph. A meaningful query usually contains several regular path expressions, but much recent research has concentrated on optimizing a single path expression. In this paper, we present an efficient technique for processing multiple path expressions in a query. We developed a data structure called the path identifier (PID), which determines whether two given nodes lie on the same path in the data graph, and used it to process multiple path expressions efficiently. We implemented our technique and present preliminary performance results.

Genealogy-based Indexing Technique for XML Documents (XML문서를 위한 족보 기반 인덱싱 기법)

  • 이월영;용환승
    • Journal of KIISE:Databases / v.31 no.1 / pp.72-81 / 2004
  • These days, a large amount of data on the Internet is represented in XML because of its advantages. As XML data grows, query processing techniques are needed that quickly and efficiently support the diverse queries used to search XML documents for useful information. So far, however, research on querying XML data has focused on how to process regular path expressions. We have therefore developed a new genealogy-based indexing technique that handles a variety of queries: not only regular path expressions but also simple path expressions, path expressions that reference other elements, and so on. We also applied this technique to an object-relational model and evaluated its performance on many documents and various query types. The results show improved performance compared with other storage techniques.

A Query Pruning Technique for Optimizing Regular Path Expressions in Semistructured Databases (준구조적 데이타베이스에서의 정규경로표현 최적화를 위한 질의전지 기법)

  • Park, Chang-Won;Jeong, Jin-Wan
    • Journal of KIISE:Databases / v.29 no.3 / pp.217-229 / 2002
  • Regular path expressions are primary elements for formulating queries over semistructured data, which does not assume a conventional schema. Query pruning is an important optimization technique that avoids useless traversals when evaluating regular path expressions. However, existing query pruning often fails to fully optimize multiple regular path expressions, and previous methods that post-process the result of existing query pruning must check an exponential number of combinations of sub-results. In this paper, we present a new query pruning technique that consists of a preprocessing phase and a pruning phase. Our two-phase query pruning is effective in optimizing multiple regular path expressions and is more scalable than the previous methods because it never checks the exponential combinations of sub-results.

An Efficient Technique for Evaluating Queries with Multiple Regular Path Expressions (다중 정규 경로 질의 처리를 위한 효율적 기법)

  • Chung, Tae-Sun;Kim, Hyoung-Joo
    • Journal of KIISE:Databases / v.28 no.3 / pp.449-457 / 2001
  • As XML has become an emerging standard for information exchange on the World Wide Web, the database community has taken an interest in extracting information from XML viewed as a database model. XML queries are based on regular path queries, which find objects reachable by given regular expressions. To answer many kinds of user queries, it is necessary to evaluate queries that have multiple regular path expressions. However, previous work such as query rewriting and query optimization in the framework of semistructured data has dealt with a single regular expression. For queries that have multiple regular expressions, we suggest a two-phase optimization technique: (1) rewriting the query using views by finding mappings from the view's body to the query's body, and (2) evaluating each conjunct of the rewritten query and combining the results. We show that our rewriting algorithm is sound and that our query evaluation technique is more efficient than previous work on optimizing semistructured queries.
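
As an illustration of what a single regular path query computes, the following minimal sketch evaluates one path expression over an edge-labeled graph by exploring (node, automaton state) pairs. It is not the rewriting or evaluation algorithm of the paper; the function name `eval_path_query`, the hand-written automaton, and the sample graph are all assumptions made for this example.

```python
from collections import deque

def eval_path_query(graph, root, nfa, start, accepting):
    """Return the nodes reachable from `root` along a label path the NFA accepts.

    graph: {node: [(edge_label, child), ...]}  (edge-labeled data graph)
    nfa:   {(state, label): {next_state, ...}}; the label "*" matches any edge
    """
    seen = {(root, start)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in accepting:
            answers.add(node)
        for label, child in graph.get(node, []):
            # Follow transitions on the exact label and on the wildcard.
            next_states = nfa.get((state, label), set()) | nfa.get((state, "*"), set())
            for nxt in next_states:
                if (child, nxt) not in seen:
                    seen.add((child, nxt))
                    queue.append((child, nxt))
    return answers

# Hypothetical data graph.
graph = {
    "root": [("book", "b1"), ("journal", "j1")],
    "b1": [("author", "a1"), ("chapter", "c1")],
    "c1": [("author", "a2")],
    "j1": [("editor", "e1")],
}

# Automaton for "book, then any number of edges, then author".
nfa = {(0, "book"): {1}, (1, "*"): {1}, (1, "author"): {2}}

authors = eval_path_query(graph, "root", nfa, 0, {2})
```

Tracking visited (node, state) pairs keeps the traversal finite even when the data graph contains cycles.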

Test Case Generation of Communication Protocol with Regular Expressions (정규표현식을 이용한 통신 프로토콜의 최소 시험 경로 생성)

  • 김한경
    • Journal of Internet Computing and Services / v.2 no.1 / pp.1-11 / 2001
  • Although Petri net and dynamic FSM methods have been proposed for generating test sequences for some specific protocols, those methods are unavailable when the protocol allows fault processing or includes looping paths, which cause errors or endless looping through state explosion. Determining the test coverage of protocol software that has already been designed and implemented is difficult because of development schedules, the technical solutions that must be supported, and economic limitations. We suggest generating protocol software test sequences in a timely manner on the basis of regular expressions that cover the functions of the protocol. With this regular expression method, 38 test sequences for the Q.2971 protocol were generated, and the endless looping problem that arises when dynamic test suites are used was minimized by simplifying the test path expressions that denote loops. Based on this work, the suggested method is confirmed to be simple and easy compared with other dynamic test sequence generation techniques. Moreover, a method for checking whether a given test path is included in the regular path expression is reviewed.

A Flexible Query Processing System for XML Regular Path Expressions (XML 정규 경로식을 위한 유연한 질의 처리 시스템)

  • 김대일;김기창;김유성
    • Journal of KIISE:Databases / v.30 no.6 / pp.641-650 / 2003
  • The eXtensible Markup Language (XML) is emerging as a standard format for data representation and exchange on the Internet. There has been research on storing and retrieving XML documents using relational databases, which offer mature techniques for large-scale data processing, recovery, concurrency control, and so on. Because previous systems use the same structure information and fundamental operations to process all kinds of XML queries, only some specific queries can be processed efficiently, not all types. In this paper, we propose a flexible query processing system. To process queries efficiently, the proposed system analyzes regular path expression queries and uses a $\theta$-join on region numbering values to check ancestor-descendant relationships and an equi-join on the parent's region start value to check parent-child relationships. The proposed system thus processes XML regular path expressions efficiently. Experimental results show that the proposed XML query processing system is more efficient than previous systems.
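
The two join conditions mentioned in this abstract can be sketched roughly as follows. This is only an illustration of region numbering in general, assuming each node stores `(start, end, parent_start)` assigned by a depth-first walk; the function names, the tuple layout, and the sample tree are hypothetical and are not taken from the paper's system.

```python
from itertools import count

def number_regions(tree, root):
    """Assign (start, end, parent_start) to every node by a depth-first walk."""
    counter = count(1)
    regions = {}

    def visit(node, parent_start):
        start = next(counter)
        for child in tree.get(node, []):
            visit(child, start)
        regions[node] = (start, next(counter), parent_start)

    visit(root, None)
    return regions

def ancestor_descendant_pairs(regions, ancestors, descendants):
    """Theta-join: a is an ancestor of d when a's region strictly contains d's."""
    return [(a, d) for a in ancestors for d in descendants
            if regions[a][0] < regions[d][0] and regions[d][1] < regions[a][1]]

def parent_child_pairs(regions, parents, children):
    """Equi-join: match a child's stored parent_start against a parent's start."""
    by_start = {regions[p][0]: p for p in parents}
    return [(by_start[regions[c][2]], c) for c in children
            if regions[c][2] in by_start]

# Hypothetical document tree with unique element names.
tree = {"book": ["title", "chapter"], "chapter": ["section"], "section": ["para"]}
regions = number_regions(tree, "book")

anc = ancestor_descendant_pairs(regions, ["book", "chapter"], ["para"])
pc = parent_child_pairs(regions, ["book", "chapter"], ["para", "section", "title"])
```

The containment test needs two inequality comparisons per pair, which is why it is a theta-join, while the parent-child test reduces to a single equality and can use a hash lookup.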

Adaptive Path Index for Efficient XML Query Processing (효율적인 XML 질의 처리를 위한 적응형 경로 인덱스)

  • 민준기;심규석;정진완
    • Journal of KIISE:Databases / v.31 no.1 / pp.61-71 / 2004
  • XML can describe a wide range of data, from regular to irregular and from flat to deeply nested. XML is therefore rapidly emerging as the de facto standard Web document format, since it supports efficient data exchange and integration. Several XML query languages have been proposed to retrieve data represented in XML. XML query languages such as XPath and XQuery use path expressions to traverse irregularly structured data composed of XML elements. Various path indexes have been proposed to evaluate path expressions. However, traditional path indexes are constructed using only the XML data structure. In this paper, we therefore propose an adaptive path index that uses query workloads as well as the XML data structure. To improve query performance, the proposed adaptive path index manages the frequently used paths and the structural summary of the XML data using a hash tree and a graph structure. Experimental results show that the adaptive path index typically improves query performance by a factor of 2 to 69 compared with existing indexes.

A Tree-structured XPath Query Reduction Scheme for Enhancing XML Query Processing Performance (XML 질의의 수행성능 향상을 위한 트리 구조 XPath 질의의 축약 기법에 관한 연구)

  • Lee, Min-Soo;Kim, Yun-Mi;Song, Soo-Kyung
    • The KIPS Transactions:PartD / v.14D no.6 / pp.585-596 / 2007
  • XML data generally has a hierarchical tree structure, which is reflected in the mechanisms used to store and retrieve it. When XML data is stored in a database, the hierarchical relationships among the XML elements are therefore taken into account during restructuring and storage. To support search queries from users, a mechanism is also needed to compute the hierarchical relationships between the element structures specified by a query. The structural join operation is one solution to this problem: it is an efficient method for computing hierarchical relationships in a database based on a node numbering scheme. However, processing a tree-structured XML query that contains complex nested hierarchical relationships still requires multiple structural joins, which leads to a high query execution cost. In this paper, we therefore provide a preprocessing mechanism that effectively reduces the cost of multiple nested structural joins by applying the concept of equivalence classes, and we suggest a query path reduction algorithm that shortens path queries consisting of a regular expression. The mechanism is designed especially to reduce path queries containing branch nodes. The experimental results show that the proposed algorithm can reduce the time required to process path queries to 1/3 of the original execution time.

A Suffix Tree Approach for Efficient XML Path Indexing (접미어 트리 구조를 이용한 효율적인 XML 경로 인덱싱)

  • 이덕형;원정임;노관준;윤지희
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.88-90 / 2002
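  • As XML documents have rapidly come into widespread, general use on the Internet, various XML query languages have been proposed for information retrieval. A common feature of XML queries is easy retrieval of structural information through regular path expressions that use characters such as '*'. In this paper, we propose a new path indexing technique based on a suffix tree. The proposed technique encodes each path in an XML document as a unique abbreviated string and stores all suffixes of each encoded string in the index. The technique processes structural queries that contain general regular path expressions very efficiently, and it can also process queries effectively when the path information is specified inaccurately.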
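
The idea of indexing every suffix of every path can be sketched as below. This is a minimal illustration that uses an uncompressed trie of suffixes rather than a true suffix tree, and it stores raw label sequences instead of the abbreviated string encoding the abstract describes; the function names and sample paths are assumptions made for this example.

```python
def build_suffix_trie(paths):
    """Insert every suffix of every label path into a trie of nested dicts."""
    root = {"ids": set(), "next": {}}
    for path_id, labels in enumerate(paths):
        for i in range(len(labels)):
            node = root
            for label in labels[i:]:
                node = node["next"].setdefault(label, {"ids": set(), "next": {}})
                # Every path passing through this trie node contains the
                # label run spelled out from the trie root to here.
                node["ids"].add(path_id)
    return root

def find_paths(trie, labels):
    """Return the ids of paths that contain `labels` as a contiguous run."""
    node = trie
    for label in labels:
        node = node["next"].get(label)
        if node is None:
            return set()
    return set(node["ids"])

# Hypothetical root-to-leaf label paths from an XML document.
paths = [
    ("book", "chapter", "title"),
    ("book", "chapter", "section", "title"),
    ("book", "author", "name"),
]
trie = build_suffix_trie(paths)

direct_titles = find_paths(trie, ("chapter", "title"))
any_title = find_paths(trie, ("title",))
```

Because all suffixes are indexed, a partial path such as `chapter/title` can be matched without knowing the labels that precede it, which is the case a leading wildcard in a path expression leaves open.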

An Accurate Log Object Recognition Technique

  • Jiho, Ju;Byungchul, Tak
    • Journal of the Korea Society of Computer and Information / v.28 no.2 / pp.89-97 / 2023
  • In this paper, we identify factors that make log analysis difficult and design a technique for detecting various objects embedded in logs, which helps subsequent analysis. In today's IT systems, logs have become a critical source of data for many advanced AI analysis techniques. Although logs contain a wealth of useful information, it is difficult to apply these techniques directly because logs are semi-structured by nature. The factors that interfere with log analysis are various embedded objects such as file paths, identifiers, and JSON documents. We designed a BERT-based object pattern recognition algorithm for these objects and performed object identification. Object pattern recognition algorithms are based on object definitions, GROK patterns, and regular expressions. We find that simple pattern matching based on known patterns and regular expressions is ineffective. The results show significantly better accuracy than using only the patterns and regular expressions. In addition, the BERT model classified objects with an accuracy as high as 99%.