Design and Implementation of an Object-Based Thesaurus System: Semi-automated Construction, Abstracted Concept Browsing and Query-Based Reference

객체기반 시소러스 시스템의 설계 및 구현: 반자동화 방식의 구축, 추상화 방식의 개념 브라우징 및 질의기반 참조

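  • 최재훈 (Department of Computer Science, Jeonbuk National University)
  • 김기헌 (Department of Computer Science, Jeonbuk National University)
  • 양재동 (Department of Computer Science, Jeonbuk National University)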
  • Published : 2000.03.31

Abstract

In this paper, we design and implement a system for managing domain-specific thesauri, in which the object-oriented paradigm is applied to thesaurus construction, concept browsing, and query-based reference. The system provides an object-oriented mechanism to assist domain experts in constructing thesauri: it determines a considerable part of the relationship degrees between terms by inheritance and supplies domain experts with information available from the thesaurus under construction. This information is especially useful for enforcing consistency among the hierarchies of a thesaurus, each constructed cooperatively by different experts at different sites. It can also minimize the burden on domain experts of exhaustively specifying individual relationships. The system also provides abstracted browsing and query-based reference, which allow users to easily verify thesaurus terms before using them in ordinary Boolean queries. Verification is performed by actively searching for the terms in the thesaurus, and reference queries and abstracted browsing views facilitate this search. This facility is indispensable when precision is especially important.

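In this paper, we design and implement an object-based thesaurus system that applies the object-oriented paradigm to construct and manage domain-dependent thesauri efficiently. The paradigm is applied to thesaurus construction, concept browsing, and query-based reference. In this system, the inheritance mechanism of the object-oriented paradigm makes it possible to grasp structurally the relationships among the concepts represented in the thesaurus, and thereby supports experts in constructing the thesaurus in a semi-automated manner. In particular, when several experts build a large thesaurus on different hosts, the information identified by this mechanism helps maintain the semantic consistency of the thesaurus and minimizes the burden on experts of specifying every degree of relatedness between concepts by hand. The object-based thesaurus system also provides query-based reference and abstracted concept browsing. These functions let users easily verify the thesaurus concepts to be used in a retrieval query by exploring them beforehand. This query verification process is particularly suited to domains that require high precision.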
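
The abstract does not spell out how relationship degrees are derived by inheritance, so the following Python fragment is only a minimal sketch under stated assumptions. It assumes that each concept is an object with at most one broader concept, that an expert records a degree in [0, 1] for some relationships, and that a narrower concept reuses the degree of its nearest broader concept unless the expert overrides it. The names `Concept`, `relate`, `degree_to`, and `broader_terms` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A thesaurus term modeled as an object in a concept hierarchy (hypothetical)."""
    name: str
    parent: Optional["Concept"] = None  # broader concept, if any
    # Degrees the expert specified explicitly: related term name -> degree in [0, 1].
    explicit: Dict[str, float] = field(default_factory=dict)

    def relate(self, other: "Concept", degree: float) -> None:
        """Record an expert-specified relationship degree to another concept."""
        if not 0.0 <= degree <= 1.0:
            raise ValueError("degree must lie in [0, 1]")
        self.explicit[other.name] = degree

    def degree_to(self, other: "Concept") -> Optional[float]:
        """Return the explicit degree, else the one inherited from the nearest
        broader concept, else None when nothing is known (assumed rule)."""
        node: Optional["Concept"] = self
        while node is not None:
            if other.name in node.explicit:
                return node.explicit[other.name]
            node = node.parent
        return None

    def broader_terms(self) -> List[str]:
        """List the broader concepts from nearest to most general; a stand-in
        for a reference query that lets a user check a term before querying."""
        names: List[str] = []
        node = self.parent
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


# Usage: only one degree is specified by hand; the narrower term inherits it.
database = Concept("database")
retrieval = Concept("information retrieval")
oodb = Concept("object-oriented database", parent=database)
database.relate(retrieval, 0.6)
print(oodb.degree_to(retrieval))  # inherited from "database"
print(oodb.broader_terms())
```

In this sketch the expert specifies a degree once on the broader concept, which illustrates how inheritance could reduce the number of relationships that have to be entered individually. An explicit call to `relate` on the narrower concept overrides the inherited value.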
