• Title/Summary/Keyword: Structural Query


Building an Integrated Protein Data Management System Using the XPath Query Process

  • Cha Hyo Soung;Jung Kwang Su;Jung Young Jin;Ryu Keun Ho
    • Proceedings of the KSRS Conference / 2004.10a / pp.99-102 / 2004
  • With recent advances in bioinformatics techniques, a great deal of research now deals with large amounts of biological data, and a variety of files and databases are used to manage these data efficiently. However, because of a lack of standardization, it is difficult to manage the data and to convert between heterogeneous formats. We are interested in integrating, storing, and managing gene and protein sequence data generated through sequencing. Accordingly, the goal of this paper is to implement a system that manages sequence data and converts one sequence file format into another. To meet these requirements, we adopt BSML (Bioinformatics Sequence Markup Language) as the standard for managing bioinformatics data, and we integrate and store the heterogeneous flat file formats using a DTD based on the BSML schema. We developed a system that applies the characteristics of an object-oriented database and processes XPath queries, an efficient form of structural query, so that XML documents can be stored and managed easily.

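The XPath-based retrieval described in the abstract above can be illustrated with a minimal sketch. It assumes a small BSML-like fragment whose element and attribute names are illustrative only (they are not taken from the paper and have not been checked against the BSML DTD), and it uses the limited XPath subset of Python's standard `xml.etree.ElementTree` rather than the object-oriented database the authors describe.

```python
import xml.etree.ElementTree as ET

# Hypothetical BSML-like fragment. Element and attribute names are
# illustrative assumptions, not the paper's actual schema.
DOC = """
<Bsml>
  <Definitions>
    <Sequences>
      <Sequence id="s1" molecule="aa" length="7">
        <Seq-data>MKTAYIA</Seq-data>
      </Sequence>
      <Sequence id="s2" molecule="dna" length="8">
        <Seq-data>ATGCGTAA</Seq-data>
      </Sequence>
    </Sequences>
  </Definitions>
</Bsml>
"""


def protein_sequences(xml_text):
    """Return (id, residues) pairs for every amino-acid sequence element."""
    root = ET.fromstring(xml_text)
    # Structural query: any <Sequence> descendant whose molecule attribute is "aa".
    hits = root.findall(".//Sequence[@molecule='aa']")
    return [(seq.get("id"), seq.findtext("Seq-data")) for seq in hits]


if __name__ == "__main__":
    print(protein_sequences(DOC))
```

The path expression selects elements by their position in the document structure and by an attribute value, which is the kind of structural constraint that a plain keyword search over flat files cannot express.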

Topic Level Disambiguation for Weak Queries

  • Zhang, Hui;Yang, Kiduk;Jacob, Elin
    • Journal of Information Science Theory and Practice / v.1 no.3 / pp.33-46 / 2013
  • Despite limited success, today's information retrieval (IR) systems are not intelligent or reliable. IR systems return poor search results when users formulate their information needs into incomplete or ambiguous queries (i.e., weak queries). Therefore, one of the main challenges in modern IR research is to provide consistent results across all queries by improving the performance on weak queries. However, existing IR approaches such as query expansion are not overly effective because they make little effort to analyze and exploit the meanings of the queries. Furthermore, word sense disambiguation approaches, which rely on textual context, are ineffective against weak queries that are typically short. Motivated by the demand for a robust IR system that can consistently provide highly accurate results, the proposed study implemented a novel topic detection that leveraged both the language model and structural knowledge of Wikipedia and systematically evaluated the effect of query disambiguation and topic-based retrieval approaches on TREC collections. The results not only confirm the effectiveness of the proposed topic detection and topic-based retrieval approaches but also demonstrate that query disambiguation does not improve IR as expected.

XML Document Retrieval Models for Heterogeneous Data Set using Independent Regular paths (독립적인 질의 경로들을 사용하여 이질적인 문서들을 검색하는 XML 문서 검색 모델)

  • 유신재;민경섭;김형주
    • Journal of KIISE:Software and Applications / v.30 no.1_2 / pp.140-152 / 2003
  • An XML document may have an irregular structure, and it is difficult for end-users to comprehend such a structure exactly. For these documents, end-users have difficulty using structured queries, so they formulate either unstructured queries or queries that carry little structural information. In this context, we propose new retrieval models that use structural information for ranking and compensate for the difference between the structure of the user's query and the structure of the document. To make querying easier, we assume independence among the query paths that represent structural constraints. Since this assumption reduces the expressive power of the query language, we also propose a model that overcomes this problem. Because there was no test collection for XML documents, we built a small test collection from the TIPSTER data of TREC and ran experiments on it without structured queries. The experiments show that our models improve average precision by about 67% over the conventional vector-space model.

IFC Model Data Retrieval and Regeneration Method through Property Set-based Query Language (IFC 속성 데이터기반의 질의어 개발을 통한 모델 정보 검색 및 재생성 방안)

  • Lee, Sang-Ho;Park, Sang I.;Jang, Young-Hoon;Choi, Kyou-Won
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.2 / pp.38-46 / 2017
  • In this study, a query language was developed to support information retrieval and model regeneration for Industry Foundation Classes (IFC)-based civil infrastructure information models. First, the IFC objects that represent structural components, the entities that manage the related properties, and the relationships that connect these elements were analyzed from the viewpoint of information flow. The results confirmed that end-users could have problems accessing and comprehending the properties and their relationships in an IFC file. Second, an IfcPropertySet-focused query method and an applicable stand-alone module were proposed with reference to the earlier Building Information Model Query Language (BimQL). The applicability of the proposed method was examined through information retrieval and model regeneration using rail and sleeper information models. The most important advantage of the proposed approach is that IFC-based information retrieval can guarantee interoperability between software packages.

Implementation and Evaluation of a Web Ontology Storage based on Relation Analysis of OWL Elements and Query Patterns (OWL 요소와 질의 패턴에 대한 관계 분석에 웹 온톨로지 저장소의 구현 및 평가)

  • Jeong, Dong-Won;Choi, Myoung-Hoi;Jeong, Young-Sik;Han, Sung-Kook
    • Journal of KIISE:Databases / v.35 no.3 / pp.231-242 / 2008
  • W3C has selected OWL as the standard for Web ontology description, which has raised the need for research on storage models that can store OWL ontologies effectively. Relational model-based storage systems such as Jena, Sesame, and DLDB have been developed, but several issues remain. In particular, they suffer from inefficient query processing. The structural causes of this low performance are as follows. Jena has a simple, non-normalized structure that stores most information in a single table. Performance decreases exponentially because unnecessary information must be compared, both for queries that require join operations and for simple searches. Other storages (e.g., Sesame) are completely normalized, so they execute many join operations during query processing, even to find a single specific class. This paper proposes a storage model that resolves the loss of query processing performance caused by either non-normalization or complete normalization in the existing storages. To achieve this goal, we analyze the problems of existing storage models as well as the relations between OWL elements and query patterns. The proposed model, defined from the results of this analysis, provides an optimally normalized structure that minimizes join operations and unnecessary information comparison. For the query processing experiments, LUBM data sets are used and query patterns are defined according to search targets and their hierarchical relations. In addition, this paper reports experiments on the correctness and completeness of query results to check whether the proposed model loses data. In the comparative evaluation, our proposal performed better than the existing storage models.

Boolean Query Formulation From Korean Natural Language Queries using Syntactic Analysis (구문분석에 기반한 한글 자연어 질의로부터의 불리언 질의 생성)

  • Park, Mi-Hwa;Won, Hyeong-Seok;Lee, Geun-Bae
    • Journal of KIISE:Software and Applications / v.26 no.10 / pp.1219-1229 / 1999
  • There is considerable evidence that trained users can achieve good search effectiveness with boolean queries, because a structural boolean query containing operators such as AND, OR, and NOT can represent a user's information need more accurately. However, it is not easy for ordinary users to construct a boolean query with appropriate boolean operators. In this paper, we propose a boolean query formulation method that automatically transforms a user's natural language query into an extended boolean query, for both effectiveness and user convenience. First, the natural language query is syntactically analyzed with a KCCG (Korean Combinatory Categorial Grammar) parser, and the resulting syntactic trees are structurally simplified by a tree-simplifying mechanism that extracts operator and keyword information in order to capture the logical relationships between keywords. Next, plausible noun phrases are identified in the simplified tree and added to the same tree as additional keywords, and weights are assigned to the keywords. Finally, the simplified syntactic tree is automatically converted into a boolean query using mapping rules and linguistic heuristics. We also propose an N-BEST average method that generates a boolean query from each of the top N syntactic trees to compensate for the adverse effect of a single incorrect top-ranked tree. In experiments using the KTSET2.0 test collection, the proposed method outperformed a traditional vector space model by 23% and, surprisingly, manually constructed boolean queries by 8%.

EP2 Labeling Scheme for XML Data (XML 데이타를 위한 EP2 레이블링 스킴)

  • 진주용;배진욱;이석호
    • Proceedings of the Korean Information Science Society Conference / 2004.10b / pp.79-81 / 2004
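  • With a range-based labeling scheme, the ancestor-descendant relationship between any two nodes can be determined easily, so queries in the form of XPath or XQuery can be processed efficiently. However, in dynamic settings where nodes are inserted, all or some of the labels may unavoidably have to be reassigned (re-labeling). In this paper, we propose the EP2 (extended preorder & postorder) labeling scheme, which improves on the Dietz labeling scheme. Using the same storage space, the proposed scheme handles dynamic updates better than range-based labeling schemes, and it can process structural queries efficiently with existing structural join algorithms.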

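The ancestor-descendant test mentioned in the abstract above can be sketched as follows. This is a minimal illustration of plain preorder/postorder (Dietz-style) labeling, not of the EP2 scheme itself, whose details the abstract does not give. The tree representation, the function names, and the sample document are assumptions made for the example.

```python
def label_tree(tree, root):
    """Assign (preorder, postorder) ranks to each node.

    `tree` maps a node name to the list of its children (an assumed,
    simplified stand-in for an XML element tree).
    """
    labels = {}
    counters = {"pre": 0, "post": 0}

    def visit(node):
        pre = counters["pre"]
        counters["pre"] += 1
        for child in tree.get(node, []):
            visit(child)
        # The postorder rank is taken only after all descendants are done.
        labels[node] = (pre, counters["post"])
        counters["post"] += 1

    visit(root)
    return labels


def is_ancestor(labels, a, d):
    """a is an ancestor of d iff a comes first in preorder and last in postorder."""
    pre_a, post_a = labels[a]
    pre_d, post_d = labels[d]
    return pre_a < pre_d and post_a > post_d


# Hypothetical document structure used only for illustration.
TREE = {"book": ["title", "chapter"], "chapter": ["section"]}
LABELS = label_tree(TREE, "book")

if __name__ == "__main__":
    print(LABELS)
    print(is_ancestor(LABELS, "book", "section"))
    print(is_ancestor(LABELS, "title", "section"))
```

Because the ranks here are consecutive integers, inserting a node forces later nodes to be renumbered, which is the re-labeling problem the abstract refers to.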

An XML Access Control Method through Filtering XPath Expressions (XPath 표현식의 필터링을 통한 XML 접근 제어 기법)

  • Jeon Jae-myeong;Chung Yon Dohn;Kim Myoung Ho;Lee Yoon Joon
    • Journal of KIISE:Databases / v.32 no.2 / pp.193-203 / 2005
  • XML (eXtensible Markup Language) is recognized as a standard for data representation and transmission on the Internet. XPath is a standard for specifying parts of XML documents and a suitable language for both query processing and access control of XML. In this paper, we use XPath expressions to represent both user queries and access control rules for XML. We propose an access control method for XML in which access to XML documents is controlled by filtering query XPath expressions through access control XPath expressions. In the proposed method, we directly search the XACT (XML Access Control Tree) for a query XPath expression and extract the access-granted parts. The XACT is our proposed structure, in which the edges are a structural summary of the XML elements and the nodes contain access-control information. We show that query XPath expressions are successfully filtered through the XACT by our proposed method, and we also show the performance improvement by comparing the proposed method with previous work.

Music Retrieval Using the Geometric Hashing Technique (기하학적 해싱 기법을 이용한 음악 검색)

  • Jung, Hyosook;Park, Seongbin
    • The Journal of Korean Association of Computer Education / v.8 no.5 / pp.109-118 / 2005
  • In this paper, we present a music retrieval system that compares the geometric structure of a melody specified by a user with those in a music database. The system finds matches between a query melody and melodies in the database by analyzing both structural and contextual features. The retrieval method is based on the geometric hashing algorithm, which consists of two steps: a preprocessing step and a recognition step. During the preprocessing step, we divide a melody into several fragments and analyze the pitch and duration of each note in the fragments to find a structural feature. To find a contextual feature, we find a main chord for each fragment. During the recognition step, we divide the query melody specified by the user into several fragments and search the database for all fragments that are structurally and contextually similar to them. A vote is cast for each matching fragment, and the piece of music with the most votes is taken to contain the melody that matches the query. Using our approach, we can find similar melodies in a music database quickly. We can also apply the method to detect plagiarism in music.

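The two-step preprocess-then-vote structure described in the abstract above can be sketched in a heavily simplified form. The sketch below uses exact-match hashing of fragments rather than true geometric hashing, and the fragment key (pitch intervals, note durations, and a chord label) is an assumption chosen for illustration. The abstract does not specify how the structural and contextual features are encoded, and all names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def fragment_key(notes, chord):
    """Hash key for a fragment given as [(pitch, duration), ...] plus a chord label.

    Assumed encoding: pitch intervals and durations as the structural
    feature, the chord label as the contextual feature.
    """
    intervals = tuple(b[0] - a[0] for a, b in zip(notes, notes[1:]))
    durations = tuple(duration for _, duration in notes)
    return (intervals, durations, chord)


def build_index(database):
    """Preprocessing step: map every fragment key to the titles containing it."""
    index = defaultdict(set)
    for title, fragments in database.items():
        for notes, chord in fragments:
            index[fragment_key(notes, chord)].add(title)
    return index


def retrieve(index, query_fragments):
    """Recognition step: each matching query fragment casts one vote per title."""
    votes = Counter()
    for notes, chord in query_fragments:
        for title in index.get(fragment_key(notes, chord), ()):
            votes[title] += 1
    return votes.most_common()


# Hypothetical toy database: two pieces, two fragments each.
DATABASE = {
    "song_a": [([(60, 1), (62, 1), (64, 2)], "C"), ([(67, 1), (65, 1), (64, 2)], "G")],
    "song_b": [([(60, 1), (62, 1), (64, 2)], "C"), ([(57, 2), (60, 2)], "Am")],
}
QUERY = [([(60, 1), (62, 1), (64, 2)], "C"), ([(67, 1), (65, 1), (64, 2)], "G")]

if __name__ == "__main__":
    print(retrieve(build_index(DATABASE), QUERY))
```

Building the index once up front is what makes the lookup fast at query time: each query fragment costs one hash lookup instead of a comparison against every stored fragment.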

A Minimum Sequence Matching Scheme for Efficient XPath Processing

  • Seo, Dong-Min;Yeo, Myung-Ho;Kim, Myoung-Ho;Yoo, Jae-Soo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.3 no.5 / pp.492-506 / 2009
  • Index structures based on sequence matching for XPath processing, such as ViST, PRIX and LCS-TRIM, have recently been proposed to reduce the search time of XML documents. However, ViST can cause a lot of unnecessary computation and I/O when processing structural join queries because its numbering scheme is not optimized. PRIX and LCS-TRIM require much processing time for matching XML data trees and queries. In this paper, we propose a novel index structure that solves the problems of ViST and improves on the performance of PRIX and LCS-TRIM. Our index structure provides a minimum sequence matching scheme to process structural queries efficiently. Finally, to verify the superiority of the proposed index structure with the minimum sequence matching scheme, we compare it with ViST, PRIX and LCS-TRIM in terms of query processing for single-path queries and for branching-path queries that include wildcards ('*' and '//').