• Title/Summary/Keyword: query formulation

Search Result 18, Processing Time 0.02 seconds

Boolean Query Formulation From Korean Natural Language Queries using Syntactic Analysis (구문분석에 기반한 한글 자연어 질의로부터의 불리언 질의 생성)

  • Park, Mi-Hwa;Won, Hyeong-Seok;Lee, Geun-Bae
    • Journal of KIISE: Software and Applications / v.26 no.10 / pp.1219-1229 / 1999
  • Boolean queries built with operators such as AND, OR, and NOT can express a user's information need precisely, and there is considerable evidence that trained searchers achieve good retrieval effectiveness with them. Ordinary users, however, are not used to expressing what they want in Boolean form. In this paper, we propose a query formulation method that automatically transforms a user's natural language query into an extended Boolean query, aiming at both retrieval effectiveness and user convenience. First, the natural language query is parsed with a KCCG (Korean Combinatory Categorial Grammar) parser, and the resulting syntactic trees are simplified by extracting operator and keyword information, so as to capture the logical relationships between keywords. Next, plausible noun phrases are identified in the simplified tree and added to it as additional keywords, and weights are assigned to the keywords. Finally, the simplified tree is automatically converted into a Boolean query using mapping rules and linguistic heuristics, and the query is used for retrieval. We also propose an N-BEST average method, which generates a Boolean query from each of the top N syntactic trees in order to limit the damage caused by a single incorrect top-ranked parse. In experiments on KTSET2.0, a test collection for information retrieval, the proposed method outperformed a traditional natural language query system based on the vector space model by 23% and, surprisingly, manually constructed Boolean queries by 8%.
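The last two steps of the abstract above (converting a simplified tree into a Boolean query, and averaging over the top N parses) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Keyword` and `Op` classes, the `term^weight` notation, and the rule that a document missing from one result list scores 0 for that list are all assumptions made for illustration. The KCCG parser and the mapping heuristics are not modelled at all.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Keyword:
    """Leaf of a simplified tree: a keyword with an illustrative weight."""
    term: str
    weight: float = 1.0


@dataclass
class Op:
    """Internal node: a Boolean operator ("AND", "OR", "NOT") over children."""
    name: str
    children: List["Node"] = field(default_factory=list)


Node = Union[Keyword, Op]


def to_boolean_query(node: Node) -> str:
    """Render a simplified tree as a weighted Boolean query string."""
    if isinstance(node, Keyword):
        return f"{node.term}^{node.weight:g}"
    if node.name == "NOT":
        # NOT is treated as unary here: only the first child is used.
        return f"NOT {to_boolean_query(node.children[0])}"
    joined = f" {node.name} ".join(to_boolean_query(c) for c in node.children)
    return f"({joined})"


def n_best_average(score_lists: List[Dict[str, float]]) -> Dict[str, float]:
    """Average document scores over the queries built from the top-N trees.

    Assumption: a document absent from one result list scores 0 for it.
    """
    n = len(score_lists)
    docs = set().union(*score_lists) if score_lists else set()
    return {d: sum(s.get(d, 0.0) for s in score_lists) / n for d in docs}


# Hypothetical simplified tree for a query about Boolean retrieval, not images.
tree = Op("AND", [
    Keyword("retrieval", 1.0),
    Op("OR", [Keyword("boolean", 0.5), Keyword("query", 0.5)]),
    Op("NOT", [Keyword("image")]),
])
print(to_boolean_query(tree))
# Hypothetical scores returned by the queries from the top two parses.
print(n_best_average([{"d1": 1.0, "d2": 0.5}, {"d1": 0.5}]))
```

Averaging the per-parse scores means that a document ranked highly by only one (possibly wrong) parse is pulled down, which is the intuition the abstract gives for the N-BEST average method.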

Intermediary Systems for Bibliographic Information Retrieval

  • Yoo, Ja Kyung
    • Journal of the Korean Society for Information Management / v.2 no.2 / pp.38-70 / 1985
  • The purpose of this paper is to review the literature on the role of end-user intermediary systems in information retrieval. The paper starts with an introduction pointing out the problems of conventional retrieval systems. The next section covers the major developments in the field of intermediary systems, including natural language processing, automatic query formulation, relevance feedback, and automatic query refinement. The paper concludes with a general overview of the current state of the art and its future implications for information retrieval.

Developing a direct manipulation-based interface to OPAC system using term relevance feedback technique (용어적합성피드백기반-OPAC시스템에 대한 직접조작의 인터페이스 구축)

  • 이영자
    • Journal of Korean Library and Information Science Society / v.26 / pp.365-400 / 1997
  • The interface design of most present query-based OPAC systems does not include a function for iterating a feedback process until the user arrives at satisfactory search results through interaction with the system. Nor does the interface provide a help function for selecting pertinent search terms. To formulate a query in a present OPAC system, a user has to learn a syntax that differs from system to system. All of these factors make it difficult for an end user to use an OPAC system effectively. This experimental system attempts to alleviate some limitations of present OPAC systems by applying the direct-manipulation technique together with the feedback principle. First, the system makes it unnecessary for a user to learn a query syntax by providing option buttons for access points. Second, the system lets a user judge whether each displayed record is relevant or not, and automatically stores the keywords of the relevant records for use in later feedback. Third, the keywords stored in the user term storage box [sayongja yongeu bogyanham] can be deleted if unnecessary or selected as search terms for query expansion as well as query modification. Fourth, after the original query has been entered, the feedback process can proceed without returning to the previous search step until the user is satisfied with the search results. In conclusion, the searching behaviors of heterogeneous users should be continuously observed, analysed, and studied, and the findings should be integrated into the design of the OPAC interface.
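The second and third features described above (storing keywords from records judged relevant, then deleting or selecting them for query expansion) can be sketched as follows. This is only an illustration of the idea under stated assumptions: the `TermStore` class and `expand_query` function are hypothetical names, and nothing here reflects the experimental system's actual interface or data model.

```python
from typing import Iterable, List


class TermStore:
    """Holds keywords harvested from records the user marked relevant."""

    def __init__(self) -> None:
        self._terms: List[str] = []

    def add_from_record(self, keywords: Iterable[str]) -> None:
        """Store the keywords of a relevant record, skipping duplicates."""
        for kw in keywords:
            if kw not in self._terms:
                self._terms.append(kw)

    def delete(self, term: str) -> None:
        """Remove a stored keyword the user considers unnecessary."""
        if term in self._terms:
            self._terms.remove(term)

    def terms(self) -> List[str]:
        """Return the stored keywords in the order they were first seen."""
        return list(self._terms)


def expand_query(query_terms: List[str], selected: Iterable[str]) -> List[str]:
    """Append user-selected stored terms to the original query terms."""
    expanded = list(query_terms)
    for term in selected:
        if term not in expanded:
            expanded.append(term)
    return expanded


# Hypothetical session: two records judged relevant, one keyword discarded.
store = TermStore()
store.add_from_record(["thesaurus", "indexing"])
store.add_from_record(["indexing", "OPAC"])
store.delete("OPAC")
print(expand_query(["catalog"], store.terms()))
```

Because the store persists between searches, the expanded query can be resubmitted repeatedly, which mirrors the iterative feedback loop the abstract describes.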

Experimental development of hypertext-based thesaurus (하이퍼텍스트 기반 디소러스의 실험적 설계와 운용)

  • 노진구
    • Journal of Korean Library and Information Science Society / v.22 / pp.373-401 / 1995
  • This study aims to improve subject retrieval by constructing a hypertext-based thesaurus that provides a browsing interface to the thesaurus. The experimental system used an IBM 486 DXII as hardware, C++ as the programming language, and Hangul Windows 3.1 as the user interface. The results of this study are summarized as follows: (1) The experimental hypertext-based thesaurus can be used as an efficient search aid for query formulation in the retrieval of bibliographic information. (2) The initial access to the hypertext-based thesaurus is via a keyword index. This index consists of all the words used to form thesaurus terms, whether descriptors or nondescriptors. (3) The hypertext-based thesaurus provides a bookmark button and a history button to alleviate the problem of disorientation. (4) The system allows an end user to view a rich variety of inter-term relationships and a complete view of conceptual associations across the information space in a nonsequential manner.

Analysis of Herbal combination frequence on Clinical Herbal formulation (임상한의사 처방의 약물 배합 빈도 분석)

  • Cha, Woong-Seok;Lee, Tae-Hyung;Lee, Byung-Wook
    • Herbal Formula Science / v.19 no.2 / pp.1-10 / 2011
  • Objectives: The 56 standard prescriptions covered by insurance have remained unchanged since their enactment in 1987. In this study, we tried to discover the most frequently used herbal combinations by analyzing prescriptions used in actual clinical settings. Methods: We wrote Structured Query Language (SQL) queries to analyze herbal combinations and used them to count the frequencies of medicinal herb combinations in medical prescription slips. Results: We found that traditional Korean medical doctors use about 13 herbs per prescription and commonly draw on 253 kinds of herbs. We also identified the most frequently used herbal combinations. Conclusions: This study suggests a new method for deciding what is needed in insurance-covered prescriptions.
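The abstract does not give the queries themselves, so the following is only a sketch of how herb-pair frequencies could be counted in SQL. The table name `prescription_herb`, its columns, and the sample rows are hypothetical and not taken from the study. It assumes each herb appears at most once per prescription. Python's standard `sqlite3` module is used to make the example runnable.

```python
import sqlite3

# Hypothetical toy data: (prescription_id, herb). Not from the study.
ROWS = [
    (1, "ginseng"), (1, "licorice"), (1, "ginger"),
    (2, "ginseng"), (2, "licorice"),
    (3, "licorice"), (3, "ginger"),
]

# Self-join on prescription_id; a.herb < b.herb keeps one row per unordered pair.
PAIR_FREQUENCY_SQL = """
SELECT a.herb AS herb_a, b.herb AS herb_b, COUNT(*) AS n
FROM prescription_herb AS a
JOIN prescription_herb AS b
  ON a.prescription_id = b.prescription_id
 AND a.herb < b.herb
GROUP BY a.herb, b.herb
ORDER BY n DESC, herb_a, herb_b
"""


def pair_frequencies(rows):
    """Load rows into an in-memory table and count co-occurring herb pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE prescription_herb (prescription_id INTEGER, herb TEXT)"
        )
        conn.executemany("INSERT INTO prescription_herb VALUES (?, ?)", rows)
        return conn.execute(PAIR_FREQUENCY_SQL).fetchall()
    finally:
        conn.close()


for herb_a, herb_b, n in pair_frequencies(ROWS):
    print(herb_a, herb_b, n)
```

The same self-join pattern extends to triples by joining the table a third time, at the cost of a rapidly growing result set for prescriptions with many herbs.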

The Study on the Search Mechanism in Digital Libraries (디지털 도서관의 탐색 메카니즘에 관한 연구)

  • 김선호
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.10 no.1 / pp.163-174 / 1999
  • The purpose of this study is to investigate and analyse end users' satisfaction with the primary factors of the search mechanism in digital libraries, namely architecture, design, format, terminology, query formulation, and hits, and then to propose improvements. The search mechanism of the National Digital Library in Korea was chosen as the sample, and 80 students majoring in library and information science were selected as subjects. End users' satisfaction was measured by questionnaire.

A Study on the Effects of Search Language on Web Searching Behavior: Focused on the Differences of Web Searching Pattern (검색 언어가 웹 정보검색행위에 미치는 영향에 관한 연구 - 웹 정보검색행위의 양상 차이를 중심으로 -)

  • Byun, Jeayeon
    • Journal of the Korean Society for Library and Information Science / v.52 no.3 / pp.289-334 / 2018
  • Even though information in languages other than English is increasing quickly, English still plays the role of lingua franca and accounts for the largest proportion of content on the web. It is therefore necessary to investigate the key features of, and differences between, "information searching behavior using the mother tongue as a search language" and "information searching behavior using English as a search language" among users who are not native speakers of English, so that they can acquire more diverse and abundant information. This study conducted a web searching experiment using the concurrent think-aloud method to examine the information searching behavior and cognitive processes of twenty-four undergraduate students at a private university in South Korea in Korean search and in English search. Based on the qualitative data, the study applied frequency analysis to web search patterns by search language. The results show that information searching behavior in Korean search is active, aggressive, and independent, while in English search it is passive, submissive, and dependent. In Korean search, the main features are query formulation by extracting and combining terms from various sources such as users, tasks, and the system; search range adjustment at diverse levels; smooth filtering when selecting items on search engine results pages; exploration and comparison of many items; and browsing of the overall contents of web pages. In English search, by contrast, the main features are query formulation with terms extracted principally from the task; search range adjustment at a limited level; item selection that relies on the relevance between items, such as categories or links; repeated exploration of the same item; browsing of partial contents of web pages; and frequent use of language support tools such as dictionaries or translators.

An Electronic Dictionary Structure supporting Truncation Search (절단검색을 지원하는 전자사전 구조)

  • 김철수
    • Journal of KIISE: Computing Practices and Letters / v.9 no.1 / pp.60-69 / 2003
  • In an information retrieval system (IRS) that uses an inverted file as its file structure, related documents can be retrieved when the searcher knows the complete words of the search fields. However, there are many cases in which the searcher knows only a partial string of the words to search for. In such cases, if the searcher can search the index terms that include the known partial string, the related documents can still be retrieved. Furthermore, when few documents are retrieved, a method is needed to find all documents whose index terms include the known partial string. To satisfy these requirements, the searcher should be able to formulate a query that uses term truncation, and the IRS should have an electronic dictionary that supports truncated search terms. This paper designs and implements an electronic dictionary (ED) structure that supports truncation search efficiently. The ED guarantees very fast, constant search time for a term entry and for its inversely alphabetized entry, regardless of the number of inserted words. To support truncation search efficiently, we use a trie structure, and to achieve fast search time we use an array-based method. When searching for a truncated term, the search time can be reduced by minimizing the length of the string to be expanded.
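The idea of pairing each term entry with its inversely alphabetized (reversed) entry can be sketched with two tries: one for right truncation (`retriev*`) and one over reversed words for left truncation (`*index`). This is an illustrative sketch, not the paper's structure: it uses Python dictionaries for child links rather than the array-based method the abstract mentions, so it does not reproduce the constant-time guarantee. The class names and the `*` wildcard syntax are assumptions.

```python
from typing import Dict, Iterable, List, Optional


class _Node:
    """One trie node: child links plus an end-of-word flag."""

    def __init__(self) -> None:
        self.children: Dict[str, "_Node"] = {}
        self.is_word = False


class Trie:
    def __init__(self) -> None:
        self._root = _Node()

    def insert(self, word: str) -> None:
        node = self._root
        for ch in word:
            node = node.children.setdefault(ch, _Node())
        node.is_word = True

    def _walk(self, prefix: str) -> Optional[_Node]:
        """Follow `prefix` from the root; return the node reached or None."""
        node = self._root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word: str) -> bool:
        node = self._walk(word)
        return node is not None and node.is_word

    def with_prefix(self, prefix: str) -> List[str]:
        """Return every stored word that starts with `prefix`, sorted."""
        start = self._walk(prefix)
        if start is None:
            return []
        found, stack = [], [(start, prefix)]
        while stack:
            node, text = stack.pop()
            if node.is_word:
                found.append(text)
            for ch, child in node.children.items():
                stack.append((child, text + ch))
        return sorted(found)


class TruncationDictionary:
    """Forward trie for 'abc*', reversed-word trie for '*abc'."""

    def __init__(self, words: Iterable[str] = ()) -> None:
        self._forward = Trie()
        self._backward = Trie()
        for word in words:
            self.add(word)

    def add(self, word: str) -> None:
        self._forward.insert(word)
        self._backward.insert(word[::-1])

    def search(self, pattern: str) -> List[str]:
        if pattern.startswith("*"):  # left truncation
            hits = self._backward.with_prefix(pattern[1:][::-1])
            return sorted(hit[::-1] for hit in hits)
        if pattern.endswith("*"):  # right truncation
            return self._forward.with_prefix(pattern[:-1])
        return [pattern] if self._forward.contains(pattern) else []


terms = TruncationDictionary(["retrieval", "retrieve", "index", "indexing", "reindex"])
print(terms.search("retriev*"))
print(terms.search("*index"))
```

Walking only the known part of the term before expanding the subtree reflects the abstract's point that search time depends on the length of the string to be expanded, not on the size of the dictionary.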