http://dx.doi.org/10.7840/kics.2013.38B.4.291

An Automatic Web Page Classification System Using Meta-Tag  

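Kim, Sang-Il (Department of Electronics and Communications Engineering, Kwangwoon University)
Kim, Hwa-Sung (Department of Electronics and Communications Engineering, Kwangwoon University)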
Abstract
The number of web pages, which carry a wide variety of information, has grown drastically with the explosive growth in WWW usage. This has created a need for web page classification, which makes web pages easier to access and allows them to be searched by group. Web page classification means grouping the web pages scattered across the web according to the similarity of the documents or the keywords they contain. It can be applied in areas such as web page search, group search, and e-mail filtering. However, manual classification cannot cope with the tremendous number of pages on the web, and automatic classification has an accuracy problem: it fails to distinguish web pages written in different forms without classification errors. In this paper, we propose an automatic web page classification system that uses the meta-tags obtainable from web pages in order to address the inaccurate web page retrieval problem.
Keywords
meta-tag; automatic classification; web page classification; Weka naive Bayes
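
The general idea of classifying pages from their meta-tags with a naive Bayes classifier can be illustrated with a small sketch. It is not the authors' system: the abstract does not say which meta fields are used, so the `keywords` and `description` fields are assumed here, and a hand-written multinomial naive Bayes with add-one smoothing stands in for the Weka classifier named in the keywords. The training pages, category labels, and all function and class names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects the content of <meta name="keywords"> and <meta name="description">."""

    FIELDS = {"keywords", "description"}  # assumed fields, not taken from the paper

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in self.FIELDS and attr.get("content"):
            self.parts.append(attr["content"])


def meta_tokens(html):
    """Return lowercase word tokens taken only from the selected meta-tags."""
    parser = MetaTagParser()
    parser.feed(html)
    return re.findall(r"[a-z0-9]+", " ".join(parser.parts).lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # documents seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()

    def train(self, tokens, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def classify(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled pages, reduced to their meta-tags for brevity.
TRAINING_PAGES = [
    ('<meta name="keywords" content="football, league, goal, match">', "sports"),
    ('<meta name="description" content="Live match scores and league tables">', "sports"),
    ('<meta name="keywords" content="python, compiler, programming, software">', "computers"),
    ('<meta name="description" content="Software tutorials for programming beginners">', "computers"),
]


def build_classifier(pages):
    clf = NaiveBayes()
    for html, label in pages:
        clf.train(meta_tokens(html), label)
    return clf


classifier = build_classifier(TRAINING_PAGES)
new_page = '<html><head><META NAME="Keywords" CONTENT="league match highlights"></head></html>'
predicted = classifier.classify(meta_tokens(new_page))
print(predicted)
```

Because only the meta-tag text is tokenized, the body markup and layout of a page do not affect the features, which is one plausible reading of how meta-tags could help with pages "written in different forms". A page without the assumed meta-tags yields no tokens, so a real system would need a fallback for that case.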