http://dx.doi.org/10.7232/iems.2012.11.1.082

Pre-Processing of Query Logs in Web Usage Mining  

Abdullah, Norhaiza Ya (Malaysia Institute of Information Technology (MIIT), Universiti Kuala Lumpur)
Husin, Husna Sarirah (Malaysia Institute of Information Technology (MIIT), Universiti Kuala Lumpur)
Ramadhani, Herny (Malaysia Institute of Information Technology (MIIT), Universiti Kuala Lumpur)
Nadarajan, Shanmuga Vivekanada (Malaysia Institute of Information Technology (MIIT), Universiti Kuala Lumpur)
Publication Information
Industrial Engineering and Management Systems / v.11, no.1, 2012, pp. 82-86
Abstract
For the past few years, query log data has been collected to study how users behave on a site. Many researchers have studied the use of query logs to extract user preferences, recommend personalization, improve caching and pre-fetching of Web objects, build better adaptive user interfaces, and improve Web search in search engine applications. A query log contains data such as the client's IP address, the time and date of the request, the resource or page requested, the status of the request, the HTTP method used, and the type of browser and operating system. A query log can offer valuable insight into website usage. Properly compiled and interpreted, query logs provide baseline statistics that indicate the usage level of a website and can serve as a tool to assist decision making in management activities. In this paper we discuss the tasks performed on query logs in the pre-processing stage of web usage mining. We use query logs from an online newspaper company. In the pre-processing stage, the clickstream data is cleaned and partitioned into a set of user interactions that represent the activities of each user during their visits to the site. The essential pre-processing tasks applied to the query logs are data cleaning and user identification.
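The two pre-processing tasks named in the abstract, data cleaning and user identification, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the log is assumed to follow the Apache/NCSA Combined Log Format, cleaning is assumed to keep only successful GET requests for non-static resources, and a user is assumed to be a distinct pair of IP address and user-agent string. The names `parse_line`, `is_relevant`, `identify_users`, `preprocess`, and the sample entries are hypothetical.

```python
import re
from collections import defaultdict

# Assumed layout: Apache/NCSA Combined Log Format (the abstract names the
# fields but not the exact format).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<resource>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumed cleaning rule: embedded objects are not page views.
STATIC_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def parse_line(line):
    """Return the log fields as a dict, or None if the line is malformed."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_relevant(entry):
    """Data cleaning: keep successful GET requests for non-static resources."""
    path = entry["resource"].split("?", 1)[0].lower()
    return (
        entry["method"] == "GET"
        and entry["status"].startswith("2")
        and not path.endswith(STATIC_SUFFIXES)
    )


def identify_users(entries):
    """User identification: group requests by (IP address, user agent)."""
    users = defaultdict(list)
    for entry in entries:
        users[(entry["ip"], entry["agent"])].append(entry["resource"])
    return dict(users)


def preprocess(lines):
    """Parse, clean, and partition raw log lines into per-user request lists."""
    parsed = (parse_line(line) for line in lines)
    cleaned = [e for e in parsed if e is not None and is_relevant(e)]
    return identify_users(cleaned)


# Hypothetical sample entries, invented for this sketch.
SAMPLE_LINES = [
    '10.0.0.1 - - [01/Jan/2012:10:00:00 +0800] "GET /news/1.html HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    '10.0.0.1 - - [01/Jan/2012:10:00:01 +0800] "GET /img/logo.gif HTTP/1.1" 200 128 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    '10.0.0.1 - - [01/Jan/2012:10:00:05 +0800] "GET /news/2.html HTTP/1.1" 404 0 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    '10.0.0.1 - - [01/Jan/2012:10:01:00 +0800] "GET /sports/3.html HTTP/1.1" 200 900 "-" "Opera/9.80 (X11; Linux)"',
]

if __name__ == "__main__":
    for user, pages in preprocess(SAMPLE_LINES).items():
        print(user, pages)
```

In this sketch the image request and the failed request are discarded during cleaning, and the two remaining requests are attributed to two users because they share an IP address but differ in user agent. Real logs would likely need further rules, for example filtering robot traffic, which this sketch does not attempt.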
Keywords
Pre-Processing; Web Log; Web Usage Mining;