Confidence Value based Large Scale OWL Horst Ontology Reasoning

신뢰 값 기반의 대용량 OWL Horst 온톨로지 추론

  • Received : 2015.10.26
  • Accepted : 2016.02.22
  • Published : 2016.05.15

Abstract

Many machine learning techniques can automatically populate ontologies with data extracted from web sources, and interest in large-scale ontology reasoning is growing. However, web data varies in reliability, and reasoning that ignores this produces results that carry uncertainty. Because little research addresses reasoning that reflects the reliability of large-scale ontologies, a confidence-value-based method for large-scale ontology reasoning is needed. In this paper, we propose a confidence-value-based large-scale OWL Horst reasoning method that runs on Spark, a distributed in-memory cluster framework. We describe a method for integrating the confidence values of data that is inferred more than once, a shortcoming of earlier studies, and a distributed parallel reasoning algorithm that addresses the problems degrading reasoning performance. To evaluate the proposed method, we ran experiments on LUBM3000; our approach performed reasoning about twice as fast as WebPIE, an existing reasoning system.
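The abstract does not state which rule is used to integrate the confidence values of duplicated data. As a minimal sketch only, the following assumes the probabilistic-sum rule from the certainty-factor model, c1 + c2 - c1*c2, for positive confidences in [0, 1]. The function names `combine` and `merge_duplicates` and the sample triples are hypothetical and are not taken from the paper.

```python
def combine(c1, c2):
    """Combine two confidences in [0, 1] with the probabilistic sum.

    Assumption: certainty-factor style combination, not the paper's
    confirmed rule. Equivalent to 1 - (1 - c1) * (1 - c2).
    """
    return c1 + c2 - c1 * c2


def merge_duplicates(triples):
    """Collapse (subject, predicate, object, confidence) records so that
    each distinct triple keeps a single integrated confidence value."""
    merged = {}
    for s, p, o, conf in triples:
        key = (s, p, o)
        if key in merged:
            merged[key] = combine(merged[key], conf)
        else:
            merged[key] = conf
    return merged


# Hypothetical sample data: the same triple derived twice with
# different confidences, plus one triple derived once.
sample = [
    ("ex:alice", "rdf:type", "ex:Student", 0.6),
    ("ex:alice", "rdf:type", "ex:Student", 0.5),
    ("ex:bob", "rdf:type", "ex:Professor", 0.9),
]
result = merge_duplicates(sample)
```

Because this rule is associative and commutative, the order in which duplicates arrive does not change the result. That property is what a keyed reduction needs, so the same function could in principle be applied per triple across partitions in a distributed setting such as Spark.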

Acknowledgement

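Grant : WiseKB: Development of a Self-Learning Knowledge Base and Reasoning Technology Based on Big Data Understanding

Supported by : Institute for Information & communications Technology Promotion (IITP)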
