A Global-Interdependence Pairwise Approach to Entity Linking Using RDF Knowledge Graph

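  • 심용선 (Department of Dental Science, Seoul National University)
  • 양성권 (Department of Medical Informatics, Seoul National University)
  • 김홍기 (Department of Dental Science, Seoul National University)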
  • Received : 2018.11.29
  • Accepted : 2019.01.04
  • Published : 2019.03.31

Abstract

Natural language text contains many kinds of entities, such as people, organizations, places, and products. A single entity name can have several meanings, and resolving this ambiguity is a challenging problem in natural language processing. Entity Linking (EL) is the task of linking an entity mentioned in text to the appropriate entity in a knowledge base. The pairwise approach, a representative EL method, links entities by using the association between two entities in a sentence. Because it considers only the interdependence between entities appearing in the same sentence, it cannot capture global interdependence. In this paper, we developed an Entity2vec model that applies Word2vec to an RDF knowledge base for EL, and we used the generated model to rank each candidate entity. To overcome the limitation of the pairwise approach, we devised a pairwise approach based on global interdependence and compared it with the existing pairwise approach.
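The abstract does not state how RDF triples are turned into Word2vec training input. As a minimal sketch, assuming each triple is serialized as a three-token "sentence" (one common choice, not necessarily the authors'), the following shows the token sequences and the skip-gram (center, context) pairs that a Word2vec-style trainer would consume. The `ex:` identifiers and both function names are hypothetical.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]


def triples_to_sentences(triples: Iterable[Triple]) -> List[List[str]]:
    """Serialize each (subject, predicate, object) triple as a 3-token sentence."""
    return [[s, p, o] for s, p, o in triples]


def skipgram_pairs(sentences: List[List[str]], window: int = 2) -> List[Tuple[str, str]]:
    """Collect (center, context) pairs within `window` tokens of each center token."""
    pairs = []
    for sent in sentences:
        for i, center in enumerate(sent):
            start = max(0, i - window)
            end = min(len(sent), i + window + 1)
            for j in range(start, end):
                if j != i:
                    pairs.append((center, sent[j]))
    return pairs


# Hypothetical toy triples; a real knowledge base would supply these.
TRIPLES = [
    ("ex:Apple_Inc", "ex:founder", "ex:Steve_Jobs"),
    ("ex:Apple_Inc", "ex:product", "ex:iPhone"),
]

sentences = triples_to_sentences(TRIPLES)
pairs = skipgram_pairs(sentences, window=2)
```

Entities that share neighbors in the graph end up with overlapping context pairs, which is what lets an embedding model place them close together in vector space.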

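Natural language expressions contain a variety of entities, such as people, organizations, places, and products. Such entities can carry multiple meanings, and the resulting ambiguity is a very challenging problem in natural language processing. Entity linking is the task of connecting an entity name that appears in text to the appropriate entity in a knowledge base. The pairwise approach, a representative entity linking method, links entities by exploiting their relatedness when two or more entities appear in one sentence. Because this method considers only the interdependence among entities appearing in the same sentence, it lacks global interdependence. In this paper, we built an Entity2vec model that applies Word2vec to an RDF knowledge base for entity linking, and we used the generated model to rank each entity. To compensate for the limitation of the pairwise approach, we devised a pairwise approach based on global interdependence, implemented it, and compared it experimentally with the existing pairwise approach.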
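The limitation described above can be illustrated with a small sketch. It is not the paper's algorithm, which the abstract does not specify: it assumes toy two-dimensional entity vectors, cosine similarity as the relatedness measure, and a score that sums each candidate's best similarity to every context mention. All names, vectors, and the example document are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]
Mention = Tuple[str, List[str]]  # (surface form, candidate entity ids)

# Hypothetical toy embeddings standing in for a trained entity embedding model.
EMB: Dict[str, Vector] = {
    "Apple_Inc": (0.9, 0.1),
    "Apple_fruit": (0.1, 0.9),
    "Steve_Jobs": (0.8, 0.2),
    "iPhone": (0.95, 0.05),
}

# Two sentences; "Apple" is the only mention in the first one.
DOC: List[List[Mention]] = [
    [("Apple", ["Apple_Inc", "Apple_fruit"])],
    [("Jobs", ["Steve_Jobs"]), ("iPhone", ["iPhone"])],
]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(candidates: List[str], context: List[Mention]) -> List[Tuple[str, float]]:
    """Score each candidate by its best similarity to every context mention."""
    scores = {}
    for cand in candidates:
        scores[cand] = sum(
            max(cosine(EMB[cand], EMB[other]) for other in other_cands)
            for _, other_cands in context
        )
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def link(doc: List[List[Mention]], global_context: bool) -> Dict[str, List[Tuple[str, float]]]:
    """Rank candidates using same-sentence context or whole-document context."""
    all_mentions = [m for sent in doc for m in sent]
    result = {}
    for sent in doc:
        for mention in sent:
            pool = all_mentions if global_context else sent
            context = [m for m in pool if m is not mention]
            result[mention[0]] = rank(mention[1], context)
    return result


sentence_level = link(DOC, global_context=False)
document_level = link(DOC, global_context=True)
```

With sentence-level context, "Apple" has no neighboring mention, so both candidates score zero and cannot be separated. With document-level context, the mentions in the second sentence contribute evidence and the company sense ranks first.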

Fig. 1. Overview of Entity Linking Using Entity2vec Model

Fig. 2. Example of Applying GIPW Algorithm

Fig. 3. Example of Applying IPW Algorithm

Fig. 4. Example of Applying ETP Algorithm

Table 1. Examples of Adam Knowledgebase

Table 2. Examples of Answer Set

Table 3. Test Results of IPW, GIPW, PPR

Table 4. Test Results for Varying Iteration Counts

Table 5. Test Results for Varying Epoch Counts

Table 6. Test Results for Varying Layer Sizes
