
Automatic Construction of Class Hierarchies and Named Entity Dictionaries using Korean Wikipedia  

Bae, Sang-Joon (Department of Computer Engineering, Dong-A University)
Ko, Young-Joong (Department of Computer Engineering, Dong-A University)
Abstract
Wikipedia, an open encyclopedia, contains a vast amount of human knowledge written by thousands of volunteer editors, and its reliability is high. In this paper, we propose a method for automatically constructing a Korean named entity dictionary using several features of Wikipedia. First, we generate class hierarchies from the class information in each Wikipedia article. Second, we map the title of each article onto these class hierarchies and calculate the entropy of the root node of each hierarchy. Finally, we construct a high-performance named entity dictionary by removing the class hierarchies whose entropy exceeds a threshold. In our experiments, the method achieved an overall F1-measure of 81.12% (precision: 83.94%, recall: 78.48%).
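The entropy-based filtering step can be illustrated with a minimal sketch. The abstract does not say which distribution the entropy is computed over, so this example assumes it is the distribution of named entity categories among the article titles mapped under each root. The function names, the toy data, and the threshold value are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def root_entropy(labels):
    """Shannon entropy (in bits) of the category labels under one root node."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over categories of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def filter_hierarchies(hierarchies, threshold):
    """Keep only hierarchies whose root entropy does not exceed the threshold."""
    return {
        root: labels
        for root, labels in hierarchies.items()
        if root_entropy(labels) <= threshold
    }


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: root class -> categories of the titles mapped under it.
hierarchies = {
    "Cities": ["LOCATION", "LOCATION", "LOCATION", "LOCATION"],
    "Mixed": ["LOCATION", "PERSON", "ORGANIZATION", "LOCATION"],
}

# Assumed threshold of 1.0 bit; the paper's actual threshold is not given here.
kept = filter_hierarchies(hierarchies, threshold=1.0)
print(sorted(kept))

# Check that the reported scores are mutually consistent.
print(round(f1(83.94, 78.48), 2))
```

In this toy setting, a hierarchy whose titles all share one category has zero entropy and is kept, while a hierarchy that mixes categories has higher entropy and is dropped. The last line confirms that the reported F1-measure is the harmonic mean of the reported precision and recall.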
Keywords
Wikipedia; Class hierarchy; Named entity dictionary; Text mining