• Title/Summary/Keyword: query Expansion

XML Document Selection and Query Expansion Modules (XML 문서선별과 질의확장을 위한 자동화 모듈 개발)

  • 김명숙;권혁돈;공용해
    • Proceedings of the Korea Multimedia Society Conference / 2004.05a / pp.455-458 / 2004
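  • This study developed the following automated modules for efficient information retrieval over XML documents of varying formats. The implemented modules are: a document extraction module that acquires XML documents; a module that generates a comprehensive DTD using an ontology; a document filtering module that uses the generated comprehensive DTD and an XML parser to pre-select the XML documents to be searched; a query expansion module that expands XML queries; and a query engine module that uses JDOM's XPath. The modules were applied to sample XML documents to test the effectiveness of XML document extraction, DTD generation, document filtering, query expansion, and the query engine.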

Enhancing the Narrow-down Approach to Large-scale Hierarchical Text Classification with Category Path Information

  • Oh, Heung-Seon;Jung, Yuchul
    • Journal of Information Science Theory and Practice / v.5 no.3 / pp.31-47 / 2017
  • The narrow-down approach, which consists of separate search and classification stages, is an effective way of dealing with large-scale hierarchical text classification. Recent approaches introduce methods of incorporating global, local, and path information extracted from web taxonomies in the classification stage. However, where path information is used, there have been few efforts to address existing limitations and develop more sophisticated methods. In this paper, we propose an expansion method to effectively exploit category path information, based on the observation that the existing method suffers from a term mismatch problem and low discrimination power due to insufficient path information. The key idea of our method is to utilize relevant information not present on category paths by adding more useful words. We evaluate the effectiveness of our method on state-of-the-art narrow-down methods and report the results with in-depth analysis.

Term Distribution Threshold Models for Information Retrieval (정보 검색을 위한 용어 분표 임계치 모델)

  • Im, Jae-Hyeon;Min, Tae-Hong
    • The Transactions of the Korea Information Processing Society / v.7 no.5 / pp.1482-1490 / 2000
  • With the increasing availability of information in electronic form, it becomes more important and feasible to have automatic methods to retrieve relevant information on the Internet. A deficiency of traditional information retrieval systems is that search terms are often different from those indexed by the systems. Thus, users may either retrieve wrong information or miss what they really want. In this paper, we used automatic query expansion based on term distribution to enhance the performance of information retrieval. We also propose a method for setting the threshold according to area distribution in order to choose additional terms.

Building Thesaurus for Science & Technology Domain Using Facets and Its Application to Inference Services (패싯(Facet)을 이용한 과학기술분야 시소러스 구축과 활용방안)

  • Hwang, Soon-Hee;Jung, Han-Min;Sung, Won-Kyung
    • Journal of Information Management / v.37 no.3 / pp.61-84 / 2006
  • In this paper, we propose a method for building a thesaurus in the Science & Technology domain and investigate its applicability as an ontology-based inference service. There are as many methods of building a thesaurus as there are roles and functions for it, and many thesauri capable of ensuring accuracy and efficiency in information search are being built by experts. After examining previous studies on the principles of thesaurus construction and the relevant concept of the "facet", we focused on its characteristics and applied it to building a thesaurus. Facets are classified into two categories: conceptual facets and relational facets. The latter contains three subcategories: category relational facets, attribute relational facets, and thematic relational facets. The thesaurus for the Science & Technology domain built using facets can be applied as a web-based inference service. As a result, three types of inference service, COP (Communities of Practice), Researcher Tracing, and Research Map, are provided by means of the ontology and can be applied to query expansion.

Knowledge-based Semantic Meta-Search Engine (지식기반 의미 메타 검색엔진)

  • Lee, In-K.;Son, Seo-H.;Kwon, Soon-H.
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.6 / pp.737-744 / 2004
  • Retrieving information from the web that corresponds well to the user's request is a crucial task of search engines. However, most conventional search engines, which match patterns against queries, have difficulty providing results that correspond to the user's request because of the uncertainty of queries. To overcome this limitation, we propose a framework for knowledge-based semantic meta-search engines with the following five processes: (i) Query formation, (ii) Query expansion, (iii) Searching, (iv) Ranking recreation, and (v) Knowledge base. Simulation results on English-language web documents show that the proposed knowledge-based semantic meta-search engine provides more correct and better search results than those obtained using Google.

Semantic Information Retrieval Based on User-Word Intelligent Network (U-WIN 기반의 의미적 정보검색 기술)

  • Im, Ji-Hui;Choi, Ho-Seop;Ock, Cheol-Young
    • Proceedings of the Korea Contents Association Conference / 2006.11a / pp.547-550 / 2006
  • The criterion for judging the performance of an information retrieval system is how accurately it retrieves the information the user wants. Search results for a query consisting only of a homograph either contain a variety of documents relating to each meaning of the word or are concentrated on documents relating to one specific meaning. In this paper, we therefore propose a semantic information retrieval technique that uses relations within the User-Word Intelligent Network (U-WIN) to disambiguate queries. In our experiment, queries are divided into two classes, homographs used in terminology and general homographs, and the expanded query takes the form "query + hypernym". We found an average precision of 73.5% for web document search alone and 70% for integrated search on two portal sites. This suggests that the U-WIN-based semantic information retrieval technique can be used effectively in an IR system.
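
  The "query + hypernym" expansion form mentioned in this abstract can be illustrated with a minimal sketch. The sense table `HYPERNYMS`, its entries, and the function `expand_query` are hypothetical names made up for illustration; they are not taken from U-WIN or from the paper, and the sketch assumes the intended sense of the homograph is already known.

```python
# Hypothetical sense inventory: each homograph maps a sense label to a hypernym.
# The entries are illustrative only and are not drawn from U-WIN.
HYPERNYMS = {
    "bank": {"finance": "financial institution", "river": "slope"},
    "bat": {"animal": "mammal", "sports": "equipment"},
}


def expand_query(query: str, sense: str) -> str:
    """Return the query in the form "query + hypernym" for the given sense.

    The query is returned unchanged when no hypernym is known for it.
    """
    hypernym = HYPERNYMS.get(query, {}).get(sense)
    return f"{query} {hypernym}" if hypernym else query


expanded = expand_query("bank", "river")
```

  Appending the hypernym steers the search toward documents about the intended sense, while queries with no known hypernym pass through untouched.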

An Efficient Concurrency Control Algorithm for Multi-dimensional Index Structures (다차원 색인구조를 위한 효율적인 동시성 제어기법)

  • 김영호;송석일;유재수
    • Journal of KIISE:Databases / v.30 no.1 / pp.80-94 / 2003
  • In this paper, we propose an enhanced concurrency control algorithm that efficiently minimizes query delay. The factors that delay search operations and degrade the concurrency of multi-dimensional index structures are node splits and MBR updates. In our algorithm, to reduce the query delay caused by split operations, we optimize the exclusive latching time on a split node. The algorithm holds exclusive latches not during the whole split time but only during the physical node split, which occupies a small part of the whole split time. To avoid the query delay caused by MBR updates, we also introduce a partial lock coupling (PLC) technique. The PLC technique increases concurrency by using lock coupling only for MBR shrinking operations, which are less frequent than MBR expansion operations. For performance evaluation, we implement the proposed algorithm and one of the existing link technique-based algorithms on MIDAS-III, the storage system of the BADA-III DBMS. We show through various experiments that the proposed algorithm outperforms the existing algorithm in terms of throughput and response time.

Pattern Analysis-Based Query Expansion for Enhancing Search Convenience (검색 편의성 향상을 위한 패턴 분석 기반 질의어 확장)

  • Jeon, Seo-In;Park, Gun-Woo;Nam, Kwang-Woo;Ryu, Keun-Ho
    • Journal of Korea Society of Industrial Information Systems / v.17 no.2 / pp.65-72 / 2012
  • In the 21st century, the amount of information resources is ever increasing, and information search systems are becoming critical for easily acquiring required information from the web. Generally, searching the web effectively requires the user to have enough prior knowledge and the ability to identify suitable keywords. However, most users search without enough prior knowledge and spend a lot of time coming up with keywords related to the information they need. Furthermore, many search engines support keyword suggestion, but this provides only a collection of similar words and does not give the user search terms that are actually related to the keywords. This paper therefore proposes a method that offers expanded, related search keywords by analyzing user query patterns, providing users with a system that makes searching for information more convenient.

A Study on Query Refinement by Online Relevance Feedback in an Information Filtering System (온라인 이용자 피드백을 사용한 정보필터링 시스템의 수정질의 최적화에 관한 연구)

  • Choi, Kwang;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.20 no.4 s.50 / pp.23-48 / 2003
  • In this study, an information filtering system was implemented and a series of relevance feedback experiments was conducted using the system. For the relevance feedback, the original queries were searched against the database and the results were reviewed by the researchers. Based on users' online relevance judgments, a pair of 17 refined queries was generated using two methods, the 'co-occurrence exclusion method' and the 'lower frequencies exclusion method'. To generate them, the original queries and the descriptors and category codes that appeared in either relevant or irrelevant document sets were used as elements. Users' relevance judgments on the search results of the refined queries were compared and analyzed against those of the original queries.

A Reranking Method Using Query Expansion and PageRank Check (페이지 랭크지수와 질의 확장을 이용한 재랭킹 방법)

  • Kim, Tae-Hwan;Jeon, Ho-Chul;Choi, Joong-Min
    • The KIPS Transactions:PartB / v.18B no.4 / pp.231-240 / 2011
  • Many search algorithms have been implemented by researchers for the World Wide Web. One of the best known is Google's, which uses PageRank technology. The PageRank approach computes the number of inlinks of each document and then ranks documents in order of inlink count. However, it is difficult to find the results a particular user needs, because this method finds documents that are valuable to the public rather than to an individual. To solve this problem, we use WordNet to analyze the user's query history. This paper proposes a personalized search engine using the user's query history and PageRank Check. We compared the performance of the proposed approach with the top 30 Google search results. The average R-precision of the proposed approach is about 60%, which is better by about 14%.
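
  The inlink-based ordering that this abstract attributes to PageRank can be sketched as follows. This is only an illustration of the simplified description given in the abstract (count the inlinks of each document, then sort), not the paper's reranking method; the full PageRank algorithm additionally weights each inlink by the rank of the linking page. The function name `rank_by_inlinks` and the sample link list are hypothetical.

```python
from collections import Counter


def rank_by_inlinks(links):
    """Order pages by number of incoming links, highest first.

    `links` is an iterable of (source, target) pairs. Self-links are
    ignored, and ties are broken alphabetically by page name.
    """
    counts = Counter()
    pages = set()
    for source, target in links:
        pages.update((source, target))
        if source != target:
            counts[target] += 1
    return sorted(pages, key=lambda page: (-counts[page], page))


# Hypothetical link graph used only to exercise the function.
sample_links = [("a", "c"), ("b", "c"), ("c", "a"), ("d", "a"), ("e", "c")]
ranking = rank_by_inlinks(sample_links)
```

  Because the ordering depends only on how many pages link to a document, it is the same for every user, which is the limitation the abstract's personalization step is meant to address.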