• Title/Summary/Keyword: pseudo relevance feedback

Search Result 14, Processing Time 0.02 seconds

Resampling Feedback Documents Using Overlapping Clusters (중첩 클러스터를 이용한 피드백 문서의 재샘플링 기법)

  • Lee, Kyung-Soon
    • The KIPS Transactions:PartB
    • /
    • v.16B no.3
    • /
    • pp.247-256
    • /
    • 2009
  • Typical pseudo-relevance feedback methods assume the top-retrieved documents are relevant and use these pseudo-relevant documents to expand terms. The initial retrieval set can, however, contain a great deal of noise. In this paper, we present a cluster-based resampling method to select better pseudo-relevant documents based on the relevance model. The main idea is to use document clusters to find dominant documents for the initial retrieval set, and to repeatedly feed the documents to emphasize the core topics of a query. Experimental results on large-scale web TREC collections show significant improvements over the relevance model. For justification of the resampling approach, we examine relevance density of feedback documents. The resampling approach shows higher relevance density than the baseline relevance model on all collections, resulting in better retrieval accuracy in pseudo-relevance feedback. This result indicates that the proposed method is effective for pseudo-relevance feedback.

Document Summarization using Pseudo Relevance Feedback and Term Weighting (의사연관피드백과 용어 가중치에 의한 문서요약)

  • Kim, Chul-Won;Park, Sun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.3
    • /
    • pp.533-540
    • /
    • 2012
  • In this paper, we propose a document summarization method using the pseudo relevance feedback and the term weighting based on semantic features. The proposed method can minimize the user intervention to use the pseudo relevance feedback. It also can improve the quality of document summaries because the inherent semantic of the sentence set are well reflected by term weighting derived from semantic feature. In addition, it uses the semantic feature of term weighting and the expanded query to reduce the semantic gap between the user's requirement and the result of proposed method. The experimental results demonstrate that the proposed method achieves better performant than other methods without term weighting.

Personalized Document Snippet Extraction Method using Fuzzy Association and Pseudo Relevance Feedback (의사연관 피드백과 퍼지 연관을 이용한 개인화 문서 스니핏 추출 방법)

  • Park, Seon;Jo, Gwang-Mun;Yang, Hu-Yeol;Lee, Seong-Ro
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.49 no.2
    • /
    • pp.137-142
    • /
    • 2012
  • Snippet is a summaries information of representing web pages which search engine provides user. Snippet and page rank in search engine abundantly influence user for visiting web pages. User sometime visits the wrong page with respect to user intention when uses snippet. The snippet extraction method is difficult to accurate comprehending user intention. In order to solve above problem, this paper proposes a new snippet extraction method using fuzzy association and pseudo relevance feedback. The proposed method uses pseudo relevance feedback to expand the use's query. It uses the fuzzy association between the expanded query and the web pages to extract snippet to be well reflected semantic user's intention. The experimental results demonstrate that the proposed method can achieve better snippet extraction performance than the other methods.

Query-based Document Summarization using Pseudo Relevance Feedback based on Semantic Features and WordNet (의미특징과 워드넷 기반의 의사 연관 피드백을 사용한 질의기반 문서요약)

  • Kim, Chul-Won;Park, Sun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.7
    • /
    • pp.1517-1524
    • /
    • 2011
  • In this paper, a new document summarization method, which uses the semantic features and the pseudo relevance feedback (PRF) by using WordNet, is introduced to extract meaningful sentences relevant to a user query. The proposed method can improve the quality of document summaries because the inherent semantic of the documents are well reflected by the semantic feature from NMF. In addition, it uses the PRF by the semantic features and WordNet to reduce the semantic gap between the high level user's requirement and the low level vector representation. The experimental results demonstrate that the proposed method achieves better performance that the other methods.

Word Embeddings-Based Pseudo Relevance Feedback Using Deep Averaging Networks for Arabic Document Retrieval

  • Farhan, Yasir Hadi;Noah, Shahrul Azman Mohd;Mohd, Masnizah;Atwan, Jaffar
    • Journal of Information Science Theory and Practice
    • /
    • v.9 no.2
    • /
    • pp.1-17
    • /
    • 2021
  • Pseudo relevance feedback (PRF) is a powerful query expansion (QE) technique that prepares queries using the top k pseudorelevant documents and choosing expansion elements. Traditional PRF frameworks have robustly handled vocabulary mismatch corresponding to user queries and pertinent documents; nevertheless, expansion elements are chosen, disregarding similarity to the original query's elements. Word embedding (WE) schemes comprise techniques of significant interest concerning QE, that falls within the information retrieval domain. Deep averaging networks (DANs) defines a framework relying on average word presence passed through multiple linear layers. The complete query is understandably represented using the average vector comprising the query terms. The vector may be employed for determining expansion elements pertinent to the entire query. In this study, we suggest a DANs-based technique that augments PRF frameworks by integrating WE similarities to facilitate Arabic information retrieval. The technique is based on the fundamental that the top pseudo-relevant document set is assessed to determine candidate element distribution and select expansion terms appropriately, considering their similarity to the average vector representing the initial query elements. The Word2Vec model is selected for executing the experiments on a standard Arabic TREC 2001/2002 set. The majority of the evaluations indicate that the PRF implementation in the present study offers a significant performance improvement compared to that of the baseline PRF frameworks.

A New Approach of Domain Dictionary Generation

  • Xi, Su Mei;Cho, Young-Im;Gao, Qian
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.12 no.1
    • /
    • pp.15-19
    • /
    • 2012
  • A Domain Dictionary generation algorithm based on pseudo feedback model is presented in this paper. This algorithm can increase the precision of domain dictionary generation algorithm. The generation of Domain Dictionary is regarded as a domain term retrieval process: Assume that top N strings in the original retrieval result set are relevant to C, append these strings into the dictionary, retrieval again. Iterate the process until a predefined number of domain terms have been generated. Experiments upon corpus show that the precision of pseudo feedback model based algorithm is much higher than existing algorithms.

An Effective Snippet Generation Method using Text Summarization Techniques based on Pseudo Relevance Feedback (유사 적합성 피드백 기반의 문서 요약 기법을 이용한 효과적인 스니펫 생성)

  • An, Hong-Guk;Ko, Young-Joong;Seo, Jung-Yun
    • 한국HCI학회:학술대회논문집
    • /
    • 2007.02a
    • /
    • pp.174-181
    • /
    • 2007
  • 정보 검색의 결과로 나타나는 요약문을 스니펫(snippet)이라 한다. 사용자는 자신이 원하는 정보를 얻기 위해 문서를 검색하는데, 이 때 스니펫은 사용자가 원하는 문서를 찾는데 중요한 역할을 한다. 본 논문에서는 정보검색 분야에서 높은 성능을 보이는 유사 적합성 피드백을 자동 문서 요약에 맞게 적용하여 높은 성능의 스니펫 생성 시스템을 구현한다. 우선, 사용자의 질의가 포함된 문장들을 일차적으로 요약 문장 후보로 추출한다. 그리고 추출된 문장 후보로부터 명사들을 질의 후보로 고려한다. 각 문장이 질의의 포함 여부에 따라 문장의 적합성을 판단하게 되고, 유사 적합성 피드백 확률 모델에 적용한 후 질의 후보들의 가중치를 추정하여 가중치 순위를 통해 확장할 질의들을 결정한다. 확장된 질의들과 기존의 질의들의 가중치를 합산하여 각 문장의 순위를 매기게 되고 가장 높은 순위의 문장들이 스니펫으로 제시된다. 논문에서 제안한 기법은 추가적인 핵심 질의들을 자동으로 확장하여 중요한 문장을 추출할 수 있다. 이 연구를 위해서 일반 상용 정보 검색 서비스에서 제공하는 스니펫을 수집하였고 이들의 정확도와 시스템의 정확도를 비교하였다. 실험 결과를 통해 살펴본 제안된 시스템의 성능은 상용 정보 검색기에서 제공되고 잇는 스니펫의 정확도 보다 우수한 성능을 보였다.

  • PDF

Snippet Extraction Method using Fuzzy (퍼지를 이용한 스니핏 추출 방법)

  • Park, Sun;Choi, Myeong Su;Kim, Cheong Ho;Kim, Cheong Uck;Na, Hee Kun;Choi, Seock Whan;Kumar, Shiu;Lee, Seong Ro
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2012.10a
    • /
    • pp.387-388
    • /
    • 2012
  • In order to solve problem which User sometime visits the wrong page with respect to user intention when uses snippet. this paper proposes a new snippet extraction method using fuzzy. The proposed method uses pseudo relevance feedback to expand the use's query. It uses the fuzzy association between the expanded query and the web pages to extract snippet to be well reflected semantic user's intention.

  • PDF

Enhancing Snippet Extraction Method using Fuzzy and Semantic Features (퍼지와 의미특징을 이용한 스니핏 추출 향상 방법)

  • Park, Sun;Lee, Yeonwoo;Cho, Kwangmoon;Yang, Huyeol;Lee, Seong Ro
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.11
    • /
    • pp.2374-2381
    • /
    • 2012
  • This paper proposes a new enhancing snippet extraction method using fuzzy and semantic features. The proposed method creates a delegate of sentence by using semantic features. It extracts snippet using fuzzy association between a delegate sentence and sentence set which well represents query. In addition, the method uses pseudo relevance feedback to expand query which extracts snippet to be well reflected semantic user's intention. The experimental results demonstrate the proposed method can achieve better snippet extraction performance than the previous methods.

Automatic Text Summarization Using Query Expansion (질의확장을 이용한 자동 문서요약)

  • 한경수;백대호;임해창
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2000.04b
    • /
    • pp.339-341
    • /
    • 2000
  • 문서요약이란 문서의 기본적인 내용을 유지하면서 문서의 복잡도를 줄이는 작업이다. 인터넷과 같은 정보기술의 발달로 정보의 양이 급증함에 따라, 정보 과적재(information over load) 문제의 해결을 위해 자동 문서요약시스템의 필요성이 대두되었다. 본 논문에서는 의사 적합성 피드백(pseudo relevance feedback)에 의한 질의확장(query expansion) 기법을 적용한 자동 문서요약 모델을 제안한다. 제안하는 모델의 특징은 질의를 분해함으로써, 적합성 피드백 과정에서 질의가 편향(bias)되어 요약이 잘못되는 문제를 방지할 수 있다는 것이다. 신문기사를 대상으로 평가한 결과 제안한 모델이 질의확장을 적용하지 않은 방법이나 하나의 질의만을 유지하는 일반적인 적합성 피드백 모델보다 더 좋은 성능을 보였다.

  • PDF