Development of a Web Mining System Based on Document Similarity

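  • Lee Kang-Chan (이강찬) (Standards Research Center, Information Technology Research Laboratory, Electronics and Telecommunications Research Institute) ;
  • Min Jae-Hong (민재홍) (Standards Research Center, Information Technology Research Laboratory, Electronics and Telecommunications Research Institute) ;
  • Park Ki-Shik (박기식) (Standards Research Center, Information Technology Research Laboratory, Electronics and Telecommunications Research Institute) ;
  • Lim Dong-Soon (임동순) (Industrial Engineering, College of Engineering, Hannam University) ;
  • Woo Hoon-Shik (우훈식) (Division of Computer, Information and Communication Engineering, College of Engineering, Daejeon University)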
  • Published : 2002.04.01

Abstract

In this study, we propose design issues and a structure for a web mining system, and, drawing on our development experience, develop a system for knowledge integration in World Wide Web environments. The developed system has three main functions: 1) gathering documents with a search agent; 2) determining similarity coefficients between any two documents from their term frequencies; 3) clustering documents based on those similarity coefficients. We believe the developed system can be used for knowledge discovery in relatively narrow domains, such as news classification and index term generation in knowledge management.
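
Functions 2) and 3) above can be illustrated with a minimal sketch. The abstract does not state which similarity coefficient or clustering method the system uses, so the choices here are assumptions made only for illustration: cosine similarity over raw term counts, and single-link grouping of documents whose similarity reaches a threshold. The function names and the `threshold` parameter are hypothetical and not taken from the paper.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Count lowercase alphanumeric tokens in one document."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(tf_a, tf_b):
    """Assumed similarity coefficient: cosine of two term-frequency vectors."""
    dot = sum(count * tf_b[term] for term, count in tf_a.items())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def cluster_documents(docs, threshold=0.5):
    """Assumed clustering: single-link grouping at a similarity threshold.

    Returns a sorted list of clusters, each a list of document indices.
    """
    tfs = [term_frequencies(d) for d in docs]
    parent = list(range(len(docs)))  # union-find forest over document indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge the groups of every pair that is similar enough.
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine_similarity(tfs[i], tfs[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(docs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Calling `cluster_documents` on a list of gathered document texts returns groups of indices into that list. A system of the kind described would likely also need stop-word removal and term weighting, which this sketch omits.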

Keywords