http://dx.doi.org/10.3745/KIPSTB.2005.12B.1.057

Passage Retrieval and Calculation Method of Topic Field by Using Field-Associated Terms  

Lee Samuel-Sangkon (School of Information Technology Engineering, Jeonju University)
Abstract
Segmenting a text without relying on auxiliary information embedded in it is an important task. This paper presents a technique for dividing a text into field-coherent passages. The method extracts field-associated terms from the text and measures how topics grow, shrink, and shift from sentence to sentence. We propose measures of topic continuity and topic transition and suggest how they can be used to find the boundaries between passages. After collecting 12,500 documents, we obtained an average precision of 88% and a recall of 78% on a Korean training set.
Keywords
Field-Associated Term; Tracing Topic Field; Calculation Method for Topic Field; Passage Retrieval
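
The abstract does not define its continuity and transition measures, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a tiny hypothetical dictionary (`FIELD_TERMS`) mapping terms to fields, places a boundary wherever a sentence's fields share nothing with those of the most recent sentence that had any, and scores predicted boundaries against reference boundaries with the standard precision and recall definitions. All names, terms, and the boundary rule are illustrative assumptions.

```python
# Hypothetical field-associated term dictionary (term -> field).
# Illustrative entries only; not taken from the paper.
FIELD_TERMS = {
    "pitcher": "sports", "inning": "sports", "goal": "sports",
    "election": "politics", "parliament": "politics", "vote": "politics",
}


def sentence_fields(sentence):
    """Return the set of fields whose terms occur in the sentence."""
    words = sentence.lower().replace(".", " ").replace(",", " ").split()
    return {FIELD_TERMS[w] for w in words if w in FIELD_TERMS}


def find_boundaries(sentences):
    """Return indices i such that a new passage starts at sentences[i].

    Assumed rule (not the paper's measure): a boundary is placed where
    the current sentence has fields and none of them is shared with the
    most recent sentence that had any field evidence.
    """
    boundaries = []
    previous = set()
    for i, sentence in enumerate(sentences):
        current = sentence_fields(sentence)
        if not current:
            continue  # no field evidence: treat the topic as continuing
        if previous and not (current & previous):
            boundaries.append(i)  # topic transition
        previous = current
    return boundaries


def precision_recall(predicted, reference):
    """Standard precision and recall over sets of boundary positions."""
    predicted, reference = set(predicted), set(reference)
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


sentences = [
    "The pitcher struck out three in the first inning.",
    "He looked calm throughout.",
    "The election result surprised parliament.",
    "Every vote was counted twice.",
]
boundaries = find_boundaries(sentences)
print(boundaries, precision_recall(boundaries, [2]))
```

In this toy input the second sentence carries no field-associated term, so it is attached to the preceding passage rather than treated as a transition. A graded continuity score over a window of sentences would be a natural refinement, but that goes beyond what the abstract specifies.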