http://dx.doi.org/10.5391/JKIIS.2013.23.3.191

English-Korean Cross-lingual Link Discovery Using Link Probability and Named Entity Recognition  

Kang, Shin-Jae (School of Computer and Information Technology, Daegu University)
Publication Information
Journal of the Korean Institute of Intelligent Systems / v.23, no.3, 2013, pp. 191-195
Abstract
This paper proposes an automatic method for discovering cross-lingual links from English Wikipedia documents to Korean ones, in order to increase connectivity among vast web resources. Whereas existing methods only roughly estimate the link probability of phrases, the proposed method selects candidate anchors from English documents by combining several sources of information: title lists and link probabilities extracted from Wikipedia dumps, and the results of named-entity recognition. The anchors are then translated into Korean, and the most suitable Korean documents containing the translated words are selected as cross-lingual link targets. In experiments, the method achieved a mean average precision (MAP) of 0.375.
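The anchor-selection step described in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly used definition of link probability (the number of articles in which a phrase appears as link anchor text, divided by the number of articles in which the phrase appears at all); the abstract does not state the exact formula. The function names, the threshold value, and the rule for combining link probability with the title list and named-entity results are all hypothetical and chosen only for illustration.

```python
def link_probability(phrase, link_counts, occurrence_counts):
    """Fraction of the articles containing `phrase` that use it as a link anchor.

    Assumed definition; the counts would come from a Wikipedia dump.
    """
    total = occurrence_counts.get(phrase, 0)
    if total == 0:
        return 0.0
    return link_counts.get(phrase, 0) / total


def select_anchors(candidates, link_counts, occurrence_counts,
                   titles, named_entities, threshold=0.5):
    """Keep candidates that pass any of three tests (hypothetical combination rule):
    high link probability, match with an article title, or named-entity status.
    Returns (phrase, probability) pairs, highest probability first.
    """
    selected = []
    for phrase in candidates:
        p = link_probability(phrase, link_counts, occurrence_counts)
        if p >= threshold or phrase in titles or phrase in named_entities:
            selected.append((phrase, p))
    # sorted() is stable, so ties keep their original candidate order.
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy counts, invented for illustration only.
    link_counts = {"machine translation": 40, "the": 1, "Daegu": 9}
    occurrence_counts = {"machine translation": 50, "the": 10000, "Daegu": 10}
    anchors = select_anchors(
        ["the", "machine translation", "Daegu", "Seoul"],
        link_counts, occurrence_counts,
        titles={"Seoul"}, named_entities={"Daegu"},
    )
    for phrase, p in anchors:
        print(f"{phrase}\t{p:.4f}")
```

In this toy run, a frequent but rarely linked word such as "the" is filtered out, while a phrase with no link statistics can still be kept because it matches an article title. The later steps (translating anchors into Korean and ranking target documents) are not sketched here.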
Keywords
Cross-lingual link discovery; Link identification; Link probability; Wikipedia