• Title/Summary/Keyword: Boolean query

Conjunctive Boolean Query Optimization based on Join Sequence Separability in Information Retrieval Systems (정보검색시스템에서 조인 시퀀스 분리성 기반 논리곱 불리언 질의 최적화)

  • 박병권;한욱신;황규영
    • Journal of KIISE:Databases / v.31 no.4 / pp.395-408 / 2004
  • A conjunctive Boolean text query is a query that searches for text documents containing all of the specified keywords, and it is the most frequently used query form in information retrieval systems. Typically, the query specifies a long list of keywords for better precision, and in this case the order of keyword processing has a significant impact on query speed. Currently known approaches to this ordering are based on heuristics and therefore cannot guarantee an optimal ordering. A systematic approach is possible by leveraging a database query processing algorithm such as dynamic programming, but it is not suitable for a text query with a typically long list of keywords because of the algorithm's exponential run-time ($O(n2^{n-1})$) for n keywords. Considering these problems, we propose a new approach based on a property called join sequence separability. This property states that the optimal join sequence is separable into two subsequences of different join methods under a certain condition on the joined relations, and this property enables us to find a globally optimal join sequence in $O(n2^{n-1})$. In this paper we describe the property formally, present an optimization algorithm based on the property, prove that the algorithm finds an optimal join sequence, and validate our approach through simulation using an analytic cost model. Comparison with heuristic text query optimization approaches shows up to 100 times faster query processing, and comparison with the dynamic programming approach shows exponentially faster query optimization (e.g., 600 times for a 10-keyword query).
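
To illustrate why keyword order matters for a conjunctive query, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a toy inverted index (a dict from keyword to a list of document IDs) and a probe-based intersection whose cost is counted as the number of membership probes; the function name `conjunctive_query` and the cost measure are hypothetical choices made for this example. The "shortest list first" ordering shown is the common heuristic, which the abstract notes cannot guarantee optimality in general.

```python
from typing import Dict, List, Tuple


def conjunctive_query(
    index: Dict[str, List[int]],
    keywords: List[str],
    shortest_first: bool,
) -> Tuple[List[int], int]:
    """Return (documents containing all keywords, number of probes performed).

    Each step probes every document of the current intermediate result
    against the next posting list, so the cost depends on processing order.
    """
    postings = [index.get(k, []) for k in keywords]
    if not postings:
        return [], 0
    if shortest_first:
        # Heuristic ordering: start from the rarest keyword.
        postings = sorted(postings, key=len)

    result = postings[0]
    probes = 0
    for plist in postings[1:]:
        lookup = set(plist)
        probes += len(result)  # one membership probe per candidate document
        result = [doc for doc in result if doc in lookup]
    return result, probes


# Toy index: "a" is very common, "c" is rare.
toy_index = {
    "a": list(range(0, 1000)),
    "b": list(range(0, 1000, 2)),
    "c": [10, 500, 998],
}

docs_given, cost_given = conjunctive_query(toy_index, ["a", "b", "c"], shortest_first=False)
docs_sorted, cost_sorted = conjunctive_query(toy_index, ["a", "b", "c"], shortest_first=True)
print(docs_given, cost_given)
print(docs_sorted, cost_sorted)
```

Both orderings return the same documents, but in this toy setting the given order performs far more probes than the shortest-first order, which is the kind of gap an ordering optimizer tries to close.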

Intelligent Query Analysis using Fuzzy Association Rule (퍼지 연관규칙을 이용한 지능적 질의해석)

  • Kim, Mi-Hye
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.6 / pp.2214-2218 / 2010
  • Association rule mining is a meaningful and useful method for extracting knowledge from large amounts of data; it provides users with useful information by describing patterns or similarities among attributes in a database. Association rules have mostly been studied as rules about the presence or absence of items in Boolean databases. In this paper, we propose an intelligent query system using fuzzy association rules, which extracts association rules by converting quantitative attribute data into nominal attribute values.

A Study on Document Retrieval of Web Using Relevance Feedback (적합성 피드백을 이용한 웹 문서검색에 관한 연구)

  • 김영천;이성주
    • Journal of the Korea Institute of Information and Communication Engineering / v.5 no.3 / pp.597-604 / 2001
  • In conventional Boolean retrieval systems, document ranking is not supported and similarity coefficients cannot be computed between queries and documents. The MMM, Paice and P-norm models have been proposed in the past to support ranking in Boolean retrieval systems. Their common property is that they interpret Boolean operators softly. In this paper we propose a new soft evaluation method for information retrieval using a query splitting relevance feedback model. We also show through performance comparison that query splitting relevance feedback (QSRF) is more efficient and effective than MMM, Paice and P-norm.
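
As a small illustration of what "interpreting Boolean operators softly" means, the sketch below shows the commonly cited form of the MMM (Mixed Min and Max) model, in which OR and AND are linear combinations of the maximum and minimum document term weights. This is not the QSRF method proposed in the paper; the function names and the coefficient value 0.7 are assumptions chosen for the example.

```python
from typing import Sequence


def mmm_or(weights: Sequence[float], c_or1: float = 0.7) -> float:
    """Soft OR: mostly the max weight, softened by the min weight."""
    return c_or1 * max(weights) + (1.0 - c_or1) * min(weights)


def mmm_and(weights: Sequence[float], c_and1: float = 0.7) -> float:
    """Soft AND: mostly the min weight, softened by the max weight."""
    return c_and1 * min(weights) + (1.0 - c_and1) * max(weights)


# Document term weights in [0, 1] for the two query terms (illustrative values).
doc_weights = [0.2, 0.8]
print(mmm_or(doc_weights), mmm_and(doc_weights))
```

With both coefficients set to 1.0 the formulas reduce to the plain fuzzy-set max and min, so the coefficients control how soft the operators are.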

Mathematical Properties of the Formulas Evaluating Boolean Operators in Information Retrieval (정보검색에서 부울연산자를 연산하는 식의 수학적 특성)

  • 이준호;이기호;조영화
    • Journal of the Korean Society for Information Management / v.12 no.1 / pp.87-97 / 1995
  • Boolean retrieval systems have been the most widely used systems in information retrieval because they are easy to implement and retrieve efficiently. Conventional Boolean retrieval systems, however, cannot rank retrieved documents in decreasing order of query-document similarity because they cannot compute similarity coefficients between queries and documents. Extended Boolean models such as fuzzy set, Waller-Kraft, Paice, P-Norm and Infinite-One have been developed to provide document ranking. In extended Boolean models, the formulas that evaluate the Boolean operators AND and OR are an important component affecting the quality of document ranking. In this paper we present mathematical properties of the formulas and analyse their effect on retrieval effectiveness. Our analyses show that P-Norm is the most suitable for achieving high retrieval effectiveness.
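
For readers unfamiliar with the formulas being analysed, here is a minimal sketch of the standard P-Norm evaluation of OR and AND, assuming equal query term weights and document term weights in [0, 1]. The formulas are the usual textbook form of the P-Norm model rather than text taken from this paper, and the function names are hypothetical.

```python
from typing import Sequence


def pnorm_or(weights: Sequence[float], p: float) -> float:
    """P-Norm OR: normalized p-norm distance from the all-zero point."""
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


def pnorm_and(weights: Sequence[float], p: float) -> float:
    """P-Norm AND: one minus the normalized p-norm distance from the all-one point."""
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)


doc_weights = [0.2, 0.8]
for p in (1, 2, 100):
    print(p, pnorm_or(doc_weights, p), pnorm_and(doc_weights, p))
```

At p = 1 both operators collapse to the arithmetic mean, and as p grows OR approaches the maximum weight and AND approaches the minimum, so p interpolates between vector-style averaging and strict fuzzy-set behaviour.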

The Design of Retrieval System Using Fuzzy Logic (퍼지 논리(論理)를 이용한 정보검색(情報檢索) 시스템의 설계(設計))

  • Cho, Hye-Min
    • Journal of Information Management / v.24 no.3 / pp.73-100 / 1993
  • In response to the limitations of Boolean retrieval systems, this paper presents the design of a retrieval system using fuzzy logic. The fuzzy retrieval system introduces weights for terms in the documents and in the query, and uses them to determine how relevant a document is to the given query. After comparing and analyzing previous research, an effective model of the fuzzy retrieval system is suggested, and the performance of the system is evaluated through actual examples.

(A Study of an Exact Match and a Partial Match as an Information Retrieval Technique) (완전 매치와 부분 매치 검색 기법에 관한 연구)

  • 김영귀
    • Journal of the Korean Society for Information Management / v.7 no.1 / pp.79-95 / 1990
  • A retrieval technique is defined as a technique for comparing the query with document representations. This study classifies retrieval techniques in terms of the characteristics of the retrieved set of documents and the representations that are used. The distinction is whether the set of retrieved documents contains only documents whose representations are an exact match with the query, or also those that are a partial match with the query. For a partial match, the set of retrieved documents also includes those that are an exact match with the query. Boolean logic, one of the exact match retrieval techniques, is currently used in most large operational information retrieval systems despite its problems and limitations. Partial match, as an alternative technique, also has various problems. Existing information retrieval systems are successful in assisting users whose needs are well-defined (e.g. Boolean logic) to retrieve relevant documents, but they should also be successful in providing retrieval assistance to the browser whose information requirements are ill-defined.

Efficient Query Expansion Method using Fuzzy Thesaurus in Component Retrieval (컴포넌트 검색에서 퍼지 시소러스를 이용한 효율적인 질의확장 방법)

  • 김귀정;한정수
    • The Journal of the Korea Contents Association / v.4 no.1 / pp.76-82 / 2004
  • In this paper, we use a thesaurus-based query evaluation method to retrieve components that are conceptually related to the classes in a query. Queries are expressed in Boolean form and expanded using a similarity table. Query expansion by thesaurus addresses the term mismatch problem and improves the precision and recall of component retrieval. To evaluate the efficiency of query expansion, we determined the most suitable critical value through simulation and compared precision and recall.

Web Information Retrieval based on Natural Language Query Analysis and Keyword Expansion (자연어 질의 분석과 검색어 확장에 기반한 웹 정보 검색)

  • 윤성희;장혜진
    • Journal of the Korean Society for Information Management / v.21 no.2 / pp.235-248 / 2004
  • For users of information retrieval systems, a natural language query is a more ideal interface than keyword and Boolean expressions. This paper proposes a retrieval technique that expands keywords from the syntactically analyzed structure of the natural language query given as user input. By combining or splitting compound nouns based on a traversal of the query's syntactic tree, and by expanding variant or abbreviated forms into multiple keywords, it can enhance the precision and correctness of the retrieval system.