Incremental Enrichment of Ontologies through Feature-based Pattern Variations

Korean title: Ontology Extension through Feature-wise Diversification of Relation Patterns

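  • 이신목 (KAIST, Department of Electrical Engineering and Computer Science) ;
  • 장두성 (KT Future Technology Laboratory) ;
  • 신지애 (Information and Communications University, School of Engineering)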
  • Published : 2008.08.29

Abstract

In this paper, we propose a model that enriches an ontology by incrementally extending its relations through variations of patterns. To generalize the initial patterns, combinations of features are considered as candidate patterns. The candidate patterns are used to extract relations from Wikipedia and are then filtered according to a reliability score based on corpus frequency. The selected patterns are then used to extract relations, and the extracted relations are in turn used to extend the patterns of the relation. Making variations of patterns within this incremental enrichment process broadens and refines the range of pattern selection, which can increase the coverage and accuracy of the extracted relations. In experiments with single-feature pattern models, we observe that lexical, headword, and hypernym features provide reliable information, while POS and syntactic features provide general information that is useful for enriching relations. Based on observations of which feature types are appropriate for each syntactic unit type, we propose, as ongoing work, a pattern model based on the composition of features.

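This paper proposes a model that extends an ontology by incrementally extracting relations through pattern diversification. In the diversification step, candidate relation patterns extracted from Wikipedia are varied feature by feature. Patterns are then selected from the diversified candidates using a reliability score based on corpus frequency. The selected patterns are used to extract relations from Wikipedia, and the extracted relations are in turn used to extend the relation patterns. By diversifying patterns during incremental learning, the model widens the range of pattern selection so that the selected patterns are progressively refined. The aim is to improve both the coverage and the accuracy of the extracted relations. Experiments with single-feature pattern models show that lexical, headword, and hypernym information favors reliability, that POS and syntactic information favors coverage, and that the feature types required differ by syntactic unit type. Based on these properties, we propose a composite-feature pattern model, which is currently work in progress.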
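The incremental loop described in the abstract (vary patterns by feature, select by reliability, extract relations, repeat) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy corpus, the two feature types, the reliability formula (the share of a pattern's corpus matches whose entity pair is already known), the threshold, and all function names are assumptions made for illustration only; the paper's actual features and scoring are not given in this record.

```python
from collections import namedtuple

Token = namedtuple("Token", "word pos")

def sent(text):
    """Parse a toy sentence such as 'rain/ENT causes/VBZ flooding/ENT'."""
    return [Token(*item.split("/")) for item in text.split()]

# Hypothetical toy corpus (stands in for Wikipedia sentences).
CORPUS = [sent(s) for s in [
    "rain/ENT causes/VBZ flooding/ENT",
    "smoking/ENT causes/VBZ cancer/ENT",
    "virus/ENT causes/VBZ fever/ENT",
    "virus/ENT triggers/VBZ fever/ENT",
    "stress/ENT triggers/VBZ headaches/ENT",
    "rain/ENT triggers/VBZ flooding/ENT",
    "paris/ENT resembles/VBZ lyon/ENT",
    "tea/ENT contains/VBZ caffeine/ENT",
    "milk/ENT contains/VBZ calcium/ENT",
    "london/ENT hosts/VBZ museums/ENT",
    "iron/ENT conducts/VBZ heat/ENT",
]]

# Two single-feature views of a token (assumed; the paper also names
# headword, hypernym, and syntactic features).
FEATURES = {"lexical": lambda t: t.word, "pos": lambda t: t.pos}

def split(sentence):
    """Toy assumption: the two entities are the first and last tokens."""
    return (sentence[0].word, sentence[-1].word), sentence[1:-1]

def view(context, feature):
    """Render the context under one feature, giving a single-feature pattern."""
    return (feature, tuple(FEATURES[feature](t) for t in context))

def reliability(pattern, known):
    """Assumed score: fraction of corpus matches whose pair is already known."""
    pairs = [pair for pair, ctx in map(split, CORPUS)
             if view(ctx, pattern[0]) == pattern]
    return sum(p in known for p in pairs) / len(pairs)

def enrich(seeds, threshold=0.6, max_rounds=5):
    """Alternate pattern selection and relation extraction until nothing new."""
    known, selected = set(seeds), set()
    for _ in range(max_rounds):
        # 1. Variation: each known instance yields one candidate per feature.
        candidates = {view(ctx, f) for pair, ctx in map(split, CORPUS)
                      if pair in known for f in FEATURES}
        # 2. Selection: keep candidates that are reliable enough.
        selected = {p for p in candidates if reliability(p, known) >= threshold}
        # 3. Extraction: pairs matched by any selected pattern.
        new = {pair for pair, ctx in map(split, CORPUS)
               if any(view(ctx, p[0]) == p for p in selected)} - known
        if not new:
            break
        known |= new
    return known, selected

known, selected = enrich({("rain", "flooding"), ("smoking", "cancer")})
print(sorted(known))
print(sorted(selected))
```

On this toy data the lexical pattern for "triggers" is rejected in the first round and accepted only after a newly extracted pair raises its score, which mirrors the idea that extracted relations feed back into pattern selection. The POS pattern matches every sentence and stays below the assumed threshold, a small-scale analogue of the general-but-less-reliable behavior the abstract attributes to POS features.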

Cited by

  1. The Method of Verification for Legal Admissibility of Digital Evidence using the Digital Forensics Ontology vol.16D, pp.2, 2009, https://doi.org/10.3745/KIPSTD.2009.16-D.2.265