Web Navigation Mining by Integrating Web Usage Data and Hyperlink Structures  

Gu Heummo (LG Electronics, Information and Communications Business Division)
Choi Joongmin (Hanyang University, Department of Computer Engineering)
Abstract
Web navigation mining is a method of discovering Web navigation patterns by analyzing Web access log data. However, log data is known to contain noise that leads to incorrect recognition of a user's navigation path through the Web's hyperlink structure. As a result, previous Web navigation mining systems that relied solely on log data have not discovered correct navigation patterns efficiently, mainly because of the complex pre-processing they require. To resolve this problem, this paper proposes a technique that combines the Web's hyperlink structure information with the Web access log data to discover navigation patterns correctly and efficiently. Our implemented Web navigation mining system, called SPMiner, produces a WebTree from the hyperlink structure of a Web site and uses it to eliminate noise in the Web log data caused by the user's abnormal navigational activities. By exploiting the structure of the Web, SPMiner markedly reduces the pre-processing overhead and, as a result, can analyze the user's search patterns efficiently.
Keywords
Web navigation mining; hyperlink structure; WebTree; SPMiner
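
The idea in the abstract, using a site's hyperlink structure to filter noisy log entries, can be illustrated with a minimal sketch. The abstract does not specify how SPMiner builds or uses its WebTree, so everything below is an assumption made for illustration: the function names `build_web_tree` and `clean_session`, the breadth-first construction, the toy site in `links`, and the filtering rule (keep a request only if some page already visited in the session links to it).

```python
from collections import deque


def build_web_tree(links, root):
    """Hypothetical WebTree: a breadth-first spanning tree of the site's
    hyperlink graph, stored as a child -> parent mapping."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in parent:  # first time this page is reached
                parent[target] = page
                queue.append(target)
    return parent


def clean_session(session, links, tree):
    """Drop log entries that are inconsistent with the hyperlink structure.

    Assumed rule (not taken from the paper): a request is kept only if the
    page belongs to the tree and some earlier page in the cleaned path
    links to it. Reloads of the current page are skipped.
    """
    path = []
    for page in session:
        if page not in tree:
            continue  # page is not part of the known site structure
        if not path:
            path.append(page)  # first valid request starts the path
        elif page == path[-1]:
            continue  # reload of the same page
        elif any(page in links.get(prev, []) for prev in path):
            path.append(page)  # reachable by a hyperlink from a visited page
        # otherwise: no visited page links here, so treat the entry as noise
    return path


# Toy site and one logged session (made-up data for illustration only).
links = {"/": ["/a", "/b"], "/a": ["/a1"], "/b": ["/b1"]}
tree = build_web_tree(links, "/")
session = ["/", "/a", "/a", "/x", "/b1", "/a1", "/b"]
print(clean_session(session, links, tree))  # ['/', '/a', '/a1', '/b']
```

In this toy run, the repeated `/a` is treated as a reload, `/x` is unknown to the site structure, and `/b1` is dropped because no previously visited page links to it. A real system would also need session identification and a policy for cached back-button navigation, which the abstract does not describe.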