Dynamic Classification of Categories in Web Search Environment

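  • Bum-Ghi Choi (Department of Computer Engineering, Inha University) ;
  • Ju-Hong Lee (Department of Computer Engineering, Inha University) ;
  • Sun Park (Department of Computer Engineering, Inha University)
  • Published : 2006.07.01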

Abstract

Directory (category) search and index (keyword) search are the two main retrieval methods offered by web search engines. Most search engines support both, so that a user who cannot find the desired result with one method can fall back on the other. Index search has high recall, but it returns so many results that the desired one is hard to pick out. Directory search frequently fails to locate a document when the category it belongs to is ambiguous or not precisely known to the user; that is, its precision is high but its recall is low. To address these problems, this paper proposes a novel method that constructs a category hierarchy dynamically. Each category is regarded as a fuzzy set of keywords, the relation between categories and keywords is quantified using fuzzy logic, and implication relations between categories are derived from it by fuzzy relational products. The merit of this method is that categories related by implication are treated as similar subcategories, which raises the recall of directory search.
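
The idea of deriving implication relations between categories from fuzzy keyword memberships can be illustrated with a minimal sketch. It is not the authors' implementation: the Łukasiewicz implication operator, the mean-based inclusion degree (in the style of a Bandler-Kohout subproduct), the threshold `alpha`, and all category names and membership values below are assumptions chosen only for illustration.

```python
from typing import Callable, Dict, List

# A category is modelled as a fuzzy set: keyword -> membership degree in [0, 1].
Membership = Dict[str, float]


def lukasiewicz(a: float, b: float) -> float:
    """Lukasiewicz fuzzy implication a -> b (one possible operator, assumed here)."""
    return min(1.0, 1.0 - a + b)


def inclusion_degree(
    sub: Membership,
    sup: Membership,
    keywords: List[str],
    implies: Callable[[float, float], float] = lukasiewicz,
) -> float:
    """Mean degree to which `sub` implies `sup` over all keywords.

    A value close to 1 means nearly every keyword of `sub` also belongs to `sup`.
    """
    if not keywords:
        raise ValueError("keywords must not be empty")
    total = sum(implies(sub.get(k, 0.0), sup.get(k, 0.0)) for k in keywords)
    return total / len(keywords)


def similar_subcategories(
    categories: Dict[str, Membership], keywords: List[str], alpha: float
) -> Dict[str, List[str]]:
    """For each category, list the other categories it can be expanded with,
    i.e. those whose inclusion degree into it is at least `alpha`."""
    return {
        name: [
            other
            for other, other_set in categories.items()
            if other != name
            and inclusion_degree(other_set, target, keywords) >= alpha
        ]
        for name, target in categories.items()
    }


# Hypothetical membership values, for illustration only.
keywords = ["engine", "tire", "race", "recipe"]
categories = {
    "Automobiles": {"engine": 1.0, "tire": 0.75, "race": 0.5},
    "Motor sports": {"engine": 0.5, "race": 0.5},
    "Cooking": {"recipe": 1.0},
}

expansion = similar_subcategories(categories, keywords, alpha=0.9)
print(expansion)
# {'Automobiles': ['Motor sports'], 'Motor sports': [], 'Cooking': []}
```

In this toy data, a directory search under "Automobiles" would also return documents filed under "Motor sports", which is how treating implied categories as subcategories can raise recall. The threshold matters: with sparse keyword sets, unrelated categories still score fairly high (0.75 here), so a low `alpha` would expand too aggressively.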
