Search | Korea Science

Jo, Seung-Hyeon;Lee, Kyung-Soon
- The KIPS Transactions:PartB
- /
- v.19B no.3
- /
- pp.189-194
- /
- 2012
In this paper, we propose a query expansion method based on word graphs using pseudo-relevant and pseudo non-relevant documents to achieve performance improvement in information retrieval. The initially retrieved documents are classified into a core cluster when a document includes core query terms extracted by query term combinations and the degree of query term proximity. Otherwise, documents are classified into a non-core cluster. The documents that belong to a core query cluster can be seen as pseudo-relevant documents, and the documents that belong to a non-core cluster can be seen as pseudo non-relevant documents. Each cluster is represented as a graph which has nodes and edges. Each node represents a term and each edge represents proximity between the term and a query term. The term weight is calculated by subtracting the term weight in the non-core cluster graph from the term weight in the core cluster graph. It means that a term with a high weight in a non-core cluster graph should not be considered as an expanded term. Expansion terms are selected according to the term weights. Experimental results on TREC WT10g test collection show that the proposed method achieves 9.4% improvement over the language model in mean average precision.
https://doi.org/10.3745/KIPSTB.2012.19B.3.189 인용 PDF KSCI

Kim, Chul-Won;Park, Sun
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.15 no.7
- /
- pp.1517-1524
- /
- 2011
In this paper, a new document summarization method, which uses the semantic features and the pseudo relevance feedback (PRF) by using WordNet, is introduced to extract meaningful sentences relevant to a user query. The proposed method can improve the quality of document summaries because the inherent semantic of the documents are well reflected by the semantic feature from NMF. In addition, it uses the PRF by the semantic features and WordNet to reduce the semantic gap between the high level user's requirement and the low level vector representation. The experimental results demonstrate that the proposed method achieves better performance that the other methods.
https://doi.org/10.6109/jkiice.2011.15.7.1517 인용 PDF KSCI