Search | Korea Science

Kang, Bo-Yeong;Kim, Dae-Won;Gu, Sang-Ok;Lee, Sang-Jo
- Proceedings of the Korean Information Science Society Conference
- /
- 2002.04b
- /
- pp.238-240
- /
- 2002
If there is an Information Retrieval system which comprehends the semantic content of documents and knows the preference of users. the system can search the information better on the Internet, or improve the IR performance. Therefore we propose the IR model which combines semantic based indexing and fuzzy relevance model. In addition to the statistical approach, we chose the semantic approach in indexing, lexical chains, because we assume it would improve the performance of the index term extraction. Furthermore, we combined the semantic based indexing with the fuzzy model, which finds out the exact relevance of the user preference and index terms. The proposed system works as follows: First, the presented system indexes documents by the efficient index term extraction method using lexical chains. And then, if a user tends to retrieve the information from the indexed document collection, the extended IR model calculates and ranks the relevance of user query. user preference and index terms by some metrics. When we experimented each module, semantic based indexing and extended fuzzy model. it gave noticeable results. The combination of these modules is expected to improve the information retrieval performance.
PDF

Kim, Jae-Soo;Nam, Young-Joon
- Journal of Information Management
- /
- v.23 no.2
- /
- pp.1-13
- /
- 1992
This study intends to measure the indexibility and the information quantity in title and abstract. The result of analysis was that when the source was title or abstract, result was not good. But when it was the title and abstract, the result was better.
PDF

Jeong, Hye-Jin;Kim, Yong-Sung
- The KIPS Transactions:PartB
- /
- v.16B no.1
- /
- pp.71-78
- /
- 2009
For more effective index extraction and index weight determination, studies of extracting indices are carried out by using document content as well as structure. However, most of studies are concentrating in calculating the importance of context rather than that of XML tag. These conventional studies determine its importance from the aspect of common sense rather than verifying that through an objective experiment. This paper, for the automatic indexing by using the tag information of XML document that has taken its place as the standard for web document management, classifies major tags of constructing a paper according to its importance and calculates the term weight extracted from the tag of low weight. By using the weight obtained, this paper proposes a method of calculating the final weight while updating the term weight extracted from the tag of high weight. In order to determine more objective weight, this paper tests the tag that user considers as important and reflects it in calculating the weight by classifying its importance according to the result. Then by comparing with the search performance while using the index weight calculated by applying a method of determining existing tag importance, it verifies effectiveness of the index weight calculated by applying the method proposed in this paper.
https://doi.org/10.3745/KIPSTB.2009.16-B.1.71 인용 PDF KSCI

Yoo, Han-mook;Kim, Han-joon;Chang, Jae-young
- Journal of KIISE
- /
- v.44 no.11
- /
- pp.1236-1243
- /
- 2017
In this paper, we propose a novel way of producing keyword networks, named LSI-based ClusterTextRank, which extracts significant key words from a set of clusters with a mutual information metric, and constructs an association network using latent semantic indexing (LSI). The proposed method reduces the dimension of documents through LSI, decomposes documents into multiple clusters through k-means clustering, and expresses the words within each cluster as a maximal spanning tree graph. The significant key words are identified by evaluating their mutual information within clusters. Then, the method calculates the similarities between the extracted key words using the term-concept matrix, and the results are represented as a keyword association network. To evaluate the performance of the proposed method, we used travel-related blog data and showed that the proposed method outperforms the existing TextRank algorithm by about 14% in terms of accuracy.
https://doi.org/10.5626/JOK.2017.44.11.1236 인용 KSCI