Linking Korean Predicates to Knowledge Base Properties

  • 원유성 (School of Computing, KAIST) ;
  • 우종성 (School of Computing, KAIST) ;
  • 김지성 (School of Computing, KAIST) ;
  • 함영균 (School of Computing, KAIST) ;
  • 최기선 (School of Computing, KAIST)
  • Received : 2015.07.27
  • Reviewed : 2015.10.12
  • Published : 2015.12.15

Abstract

This paper addresses relation extraction, one step in the process of transforming natural language sentences into knowledge that conforms to the schema of a knowledge base. In particular, we focus on the predicate of a sentence and identify the knowledge base properties (relations) most relevant to it, which in turn clarifies the meaning of the relation between two entities. Following the widely used distant supervision approach, we automatically build labeled data from a knowledge base and natural language text, and use it to lexicalize knowledge base properties. In other words, a predicate that expresses the relation between two entities in a sentence is linked to properties defined by an ontology in the knowledge base. This linking provides a basis for generating structured information from text and, ultimately, for enriching the knowledge base.
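As a rough illustration of the distant supervision idea summarized in the abstract, the sketch below pairs sentence predicates with knowledge base properties whenever both entities of a sentence also appear together in a knowledge base triple, then ranks the properties for each predicate by relative frequency. It is a minimal sketch under stated assumptions, not the paper's actual pipeline: the toy triples, the pre-analyzed sentences, and the function names `build_labeled_data` and `rank_properties` are all hypothetical, and entity linking and predicate extraction are assumed to have been done beforehand.

```python
from collections import Counter, defaultdict

# Hypothetical toy knowledge base: (subject, property, object) triples.
KB_TRIPLES = [
    ("Sejong", "birthPlace", "Hanyang"),
    ("Sejong", "parent", "Taejong"),
    ("Yi_Sun-sin", "birthPlace", "Hanyang"),
]

# Hypothetical pre-analyzed sentences: (entity 1, predicate lemma, entity 2).
# Entity linking and predicate extraction are assumed to be done already.
SENTENCES = [
    ("Sejong", "태어나다", "Hanyang"),
    ("Yi_Sun-sin", "태어나다", "Hanyang"),
    ("Sejong", "아들이다", "Taejong"),
    ("Sejong", "방문하다", "Jeju"),  # no matching triple, so no label
]


def build_labeled_data(triples, sentences):
    """Label each sentence predicate with every KB property that holds
    between the sentence's two entities (the distant supervision step)."""
    index = defaultdict(set)
    for subj, prop, obj in triples:
        index[(subj, obj)].add(prop)

    labeled = []
    for e1, predicate, e2 in sentences:
        # sorted() only makes the output order deterministic.
        for prop in sorted(index.get((e1, e2), ())):
            labeled.append((predicate, prop))
    return labeled


def rank_properties(labeled):
    """For each predicate, rank candidate properties by relative frequency."""
    counts = defaultdict(Counter)
    for predicate, prop in labeled:
        counts[predicate][prop] += 1

    ranking = {}
    for predicate, counter in counts.items():
        total = sum(counter.values())
        ranking[predicate] = [
            (prop, n / total) for prop, n in counter.most_common()
        ]
    return ranking


if __name__ == "__main__":
    labeled = build_labeled_data(KB_TRIPLES, SENTENCES)
    for predicate, candidates in rank_properties(labeled).items():
        print(predicate, candidates)
```

Relative frequency is used here only as the simplest possible scoring choice; the abstract does not specify which scoring or learning method the authors apply to the automatically labeled data.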

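Project Information

Research project number : WiseKB: Development of a Self-Learning Knowledge Base and Reasoning Technology Based on Big Data Understanding

Research project host institution : Institute for Information & communications Technology Promotion (IITP)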
