• Title/Summary/Keyword: 대용량 온톨로지 (large-scale ontology)


Cooperative Development Method for Construction of Large-scale Ontology (대용량 온톨로지 구축에 있어서 협력적 개발 방법)

  • Mon, Hong-Goo;Park, Dong-Hun;Cho, Yi-Hyon;Kwon, Hyuk-Chul
    • Proceedings of the Korea Information Processing Society Conference / 2006.11a / pp.47-50 / 2006
  • As the importance of the Semantic Web has become prominent in recent years, research on it is being actively pursued in a variety of fields. For this Semantic Web research to progress, ontologies are needed across many domains. Although many ontology-construction tools now make building ontologies more convenient, constructing an ontology still requires considerable effort and time, and constructing a large-scale ontology requires even more. This paper therefore compares and analyzes the ontology construction method models proposed so far and, on that basis, presents a cooperative development process for large-scale ontology construction.


Efficient OWL Ontology Storage Model based on Jena2 (Jena2 기반의 효율적인 OWL Ontology 관리를 위한 저장 모델)

  • Shin, Hee-Young;Jeong, Dong-Won;Kim, Jin-Hyung;Baik, Doo-Kwon
    • Proceedings of the Korean Information Science Society Conference / 2007.06c / pp.144-148 / 2007
  • Since the W3C adopted OWL as the standard ontology language, many ontologies have been described and implemented in OWL. This has raised the need for a model that can efficiently store and query large OWL documents, and various frameworks such as Jena, Protégé, Sesame, and FaCT have been proposed and are being actively studied. This paper addresses the performance degradation that the basic Jena2 storage model suffers when processing large OWL data, caused by storing a document's information in a single table, and proposes a relational database model for OWL ontologies that supports efficient storage, management, and querying of large OWL documents. An adapter and a converter for the proposed OWL ontology relational database model are also presented.
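The abstract above describes splitting a single Jena2 statement table into several purpose-specific tables. A minimal sketch of that idea follows, assuming a hypothetical SQLite schema in which triples are routed to class, property, or instance tables; the table names and grouping rules are illustrative, not the paper's actual model.

```python
import sqlite3

# Hypothetical split-table layout: separate tables per triple group instead of
# one statement table, so large OWL data can be scanned selectively.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE class_triples    (subj TEXT, pred TEXT, obj TEXT);
CREATE TABLE property_triples (subj TEXT, pred TEXT, obj TEXT);
CREATE TABLE instance_triples (subj TEXT, pred TEXT, obj TEXT);
CREATE INDEX idx_inst_pred ON instance_triples (pred, obj);
""")

def insert_triple(subj, pred, obj):
    """Route a triple to the table for its group (a stand-in for an adapter)."""
    if pred in ("rdfs:subClassOf", "owl:equivalentClass"):
        table = "class_triples"
    elif pred in ("rdfs:subPropertyOf", "rdfs:domain", "rdfs:range"):
        table = "property_triples"
    else:
        table = "instance_triples"
    con.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", (subj, pred, obj))

insert_triple("ex:Student", "rdfs:subClassOf", "ex:Person")
insert_triple("ex:alice", "rdf:type", "ex:Student")
con.commit()

# A query now touches only the relevant table rather than one huge statement table.
print(con.execute("SELECT subj FROM instance_triples WHERE pred='rdf:type'").fetchall())
```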


A Design and Implementation of Table Structure and a System Based on Hive for Processing Large RDF Data (대용량 RDF 데이터 처리를 위한 Hive 기반 테이블 구조 및 시스템의 설계 및 구현)

  • Lee, Dae-Hee;Son, Young-Seok;Ha, Young-Guk
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.255-257 / 2015
  • In the Semantic Web field, data is represented in ontology form so that its meaning can be understood not only by humans but also by machines such as computers. As the size of this ontology data keeps growing, so does the need to process large-scale ontology data. This paper therefore proposes a Hive-based system that can store and query large ontology (RDF) data, together with a table design that uses Hive's partitioning to improve query response time. To evaluate the proposed system, query response times are measured.
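As a rough illustration of the partitioned table design mentioned above, the PySpark sketch below stores triples partitioned by predicate so that a query filtering on one predicate reads only that partition. The column names, partition key, and storage path are assumptions, not the paper's Hive schema.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdf-partitioning-sketch").getOrCreate()

# Namespace prefixes are omitted from predicates to keep partition names simple.
triples = spark.createDataFrame(
    [("ex:alice", "type", "ex:Student"),
     ("ex:alice", "takesCourse", "ex:Course1"),
     ("ex:Student", "subClassOf", "ex:Person")],
    ["subject", "predicate", "object"])

# Partition the stored data by predicate (Hive-style partition pruning):
# a query that filters on one predicate only scans that partition's files.
triples.write.mode("overwrite").partitionBy("predicate").parquet("/tmp/rdf_by_predicate")

loaded = spark.read.parquet("/tmp/rdf_by_predicate")
loaded.createOrReplaceTempView("triples")
spark.sql("SELECT subject FROM triples WHERE predicate = 'type'").show()
```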

Scalable Ontology Reasoning Using GPU Cluster Approach (GPU 클러스터 기반 대용량 온톨로지 추론)

  • Hong, JinYung;Jeon, MyungJoong;Park, YoungTack
    • Journal of KIISE / v.43 no.1 / pp.61-70 / 2016
  • In recent years, there has been a need for techniques for large-scale ontology inference, both to infer new knowledge from existing knowledge at high speed and to support a diversity of semantic services. With the recent advances in distributed computing, ontology inference engines have mostly been developed on Hadoop or Spark frameworks running on large clusters. Parallel programming techniques using GPGPU, which provides many more cores than a CPU, are also used for ontology inference. In this paper, by combining the advantages of both techniques, we propose a new method for reasoning over large RDFS ontology data that uses the Spark in-memory framework and infers over the distributed data at high speed using GPGPU. With GPGPU, ontology reasoning over high-volume data can be performed at low cost and with higher efficiency than conventional inference methods. In addition, we show that GPGPU can reduce the data workload on each node of the Spark cluster. To evaluate our approach, we used LUBM datasets ranging from 10 to 120. Our experimental results showed that the proposed reasoning engine performs 7 times faster than a conventional approach that uses a Spark in-memory inference engine.
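A single RDFS rule application of the kind such engines distribute can be written as a Spark join. The sketch below applies rule rdfs9 (type propagation through rdfs:subClassOf) once; the GPGPU offload described in the abstract is not reproduced, and the data is illustrative.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdfs9-sketch").getOrCreate()
sc = spark.sparkContext

sub_class_of = sc.parallelize([("Student", "Person"), ("Person", "Agent")])  # (C, D)
rdf_type = sc.parallelize([("alice", "Student"), ("bob", "Person")])         # (x, C)

# rdfs9: (x rdf:type C) and (C rdfs:subClassOf D)  =>  (x rdf:type D)
types_by_class = rdf_type.map(lambda t: (t[1], t[0]))          # key by class: (C, x)
inferred = (types_by_class
            .join(sub_class_of)                                # (C, (x, D))
            .map(lambda kv: (kv[1][0], kv[1][1]))              # (x, D)
            .distinct())

# A full materialization would repeat this join until no new triples appear;
# in a GPGPU variant each partition's pairs would be handed to a device kernel.
print(inferred.collect())   # e.g. [('alice', 'Person'), ('bob', 'Agent')]
```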

A Scalable OWL Horst Lite Ontology Reasoning Approach based on Distributed Cluster Memories (분산 클러스터 메모리 기반 대용량 OWL Horst Lite 온톨로지 추론 기법)

  • Kim, Je-Min;Park, Young-Tack
    • Journal of KIISE / v.42 no.3 / pp.307-319 / 2015
  • Current ontology studies use the Hadoop distributed storage framework to perform MapReduce-based reasoning for scalable ontologies. In this paper, however, we propose a novel approach for scalable Web Ontology Language (OWL) Horst Lite ontology reasoning based on distributed cluster memories. Rule-based reasoning, which is frequently used for scalable ontologies, iteratively executes triple-format ontology rules until no new data can be inferred. Therefore, when scalable ontology reasoning is performed on hard disks, the reasoner suffers from performance limitations. To overcome this drawback, we propose an approach that loads the ontologies into distributed cluster memories using Spark, a memory-based distributed computing framework, and executes the ontology reasoning there. To implement an appropriate OWL Horst Lite ontology reasoning system on Spark, our method divides the scalable ontologies into blocks, loads each block into the cluster nodes, and subsequently handles the data in the distributed memories. We used the Lehigh University Benchmark, which is used to evaluate ontology inference and search speed, to experimentally evaluate the methods suggested in this paper, applying them to LUBM8000 (1.1 billion triples, 155 gigabytes). When compared with WebPIE, a representative MapReduce-based scalable ontology reasoner, the proposed approach showed a throughput improvement of 320% (62k/s) over WebPIE (19k/s).
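The "iterate until nothing new is inferred" loop described above can be sketched on Spark as a fixpoint computation over cached RDDs. The example below computes the transitive closure of rdfs:subClassOf as a stand-in for the full OWL Horst Lite rule set; the single rule and the data are simplifications.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("fixpoint-sketch").getOrCreate()
sc = spark.sparkContext

# (subclass, superclass) pairs kept in cluster memory
closure = sc.parallelize([("A", "B"), ("B", "C"), ("C", "D")]).cache()

while True:
    before = closure.count()
    # (x, y) and (y, z)  =>  (x, z)
    new_pairs = (closure.map(lambda e: (e[1], e[0]))   # key by the middle node: (y, x)
                        .join(closure)                 # (y, (x, z))
                        .map(lambda kv: (kv[1][0], kv[1][1])))
    closure = closure.union(new_pairs).distinct().cache()
    if closure.count() == before:                      # fixpoint: nothing new inferred
        break

print(sorted(closure.collect()))
```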

Extended Ontology Model based on DBMS (DBMS 기반의 온톨로지 확장 모델)

  • Lee, Mi-Kyoung;Kim, Pyung;Jung, Han-Min;Sung, Won-Kyung
    • Proceedings of the Korean Information Science Society Conference / 2006.10b / pp.284-288 / 2006
  • This paper describes a DBMS-based extended ontology model, the persistent model used in the inference service system (OntoThink-K®) of OntoFrame-K®, a knowledge-based information distribution platform that incorporates Semantic Web technology. Because OntoFrame-K® handles large volumes of knowledge data, existing ontology inference engines show many limitations when applied to it. We therefore designed and implemented an extended ontology model that can process large-scale knowledge data reliably and guarantees the trustworthiness and consistency of inference. The module takes OWL and instance data converted into triple form as input and generates the knowledge data required by the inference service through forward-chaining reasoning over ontology schema rules and user-defined rules. Since the model stores large knowledge data in a DBMS and extends the knowledge model by forward reasoning according to the inference rules, data consistency is guaranteed.
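A minimal sketch of forward chaining over triples kept in a DBMS is shown below, using a single illustrative schema rule (type propagation through rdfs:subClassOf) applied until no new rows appear; the platform's actual rule set, user-defined rules, and table layout are not reproduced here.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE triples (subj TEXT, pred TEXT, obj TEXT, UNIQUE(subj, pred, obj))")
con.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
    ("ex:Person",  "rdfs:subClassOf", "ex:Agent"),
    ("ex:alice",   "rdf:type",        "ex:Student"),
])

while True:
    before = con.execute("SELECT COUNT(*) FROM triples").fetchone()[0]
    # Derive (x rdf:type D) from (x rdf:type C) and (C rdfs:subClassOf D);
    # INSERT OR IGNORE keeps already-derived triples from being duplicated.
    con.execute("""
        INSERT OR IGNORE INTO triples
        SELECT t.subj, 'rdf:type', s.obj
        FROM triples t JOIN triples s ON t.obj = s.subj
        WHERE t.pred = 'rdf:type' AND s.pred = 'rdfs:subClassOf'
    """)
    after = con.execute("SELECT COUNT(*) FROM triples").fetchone()[0]
    if after == before:        # fixpoint: the knowledge model is fully extended
        break

print(con.execute("SELECT * FROM triples WHERE pred = 'rdf:type'").fetchall())
```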


Confidence Value based Large Scale OWL Horst Ontology Reasoning (신뢰 값 기반의 대용량 OWL Horst 온톨로지 추론)

  • Lee, Wan-Gon;Park, Hyun-Kyu;Jagvaral, Batselem;Park, Young-Tack
    • Journal of KIISE / v.43 no.5 / pp.553-561 / 2016
  • Several machine learning techniques can automatically populate ontology data from web sources, and interest in large-scale ontology reasoning is increasing. However, automatically extracted data carries uncertainty, which can lead to speculative inference results, so the reliability of the various data obtained from the web needs to be taken into account. Large-scale ontology reasoning methods based on confidence values are therefore required, since reasoning that treats all ontology data as equally reliable is insufficient. In this study, we propose a large-scale OWL Horst reasoning method based on confidence values using Spark, a distributed in-memory framework. We describe a method for integrating the confidence values of duplicated data, and explain a distributed parallel heuristic algorithm that counters the performance degradation the inference would otherwise suffer. To evaluate the performance of the confidence-based reasoning methods, experiments were conducted using LUBM3000. The results showed that our approach performs reasoning twice as fast as existing reasoning systems such as WebPIE.
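One step the abstract singles out is integrating the confidence values of duplicated triples before reasoning. The sketch below merges duplicates with a reduceByKey; keeping the maximum confidence is only an illustrative stand-in for the paper's integration method.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("confidence-merge-sketch").getOrCreate()
sc = spark.sparkContext

# ((subject, predicate, object), confidence) pairs as they might arrive from
# several web extraction runs; the triples and scores are made up.
scored_triples = sc.parallelize([
    (("ex:alice", "rdf:type", "ex:Student"), 0.7),
    (("ex:alice", "rdf:type", "ex:Student"), 0.9),   # duplicate, different score
    (("ex:bob",   "rdf:type", "ex:Student"), 0.6),
])

merged = scored_triples.reduceByKey(max)   # one confidence value per distinct triple
print(merged.collect())
```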

An Approach of Scalable SHIF Ontology Reasoning using Spark Framework (Spark 프레임워크를 적용한 대용량 SHIF 온톨로지 추론 기법)

  • Kim, Je-Min;Park, Young-Tack
    • Journal of KIISE / v.42 no.10 / pp.1195-1206 / 2015
  • For the management of a knowledge system, systems that automatically infer and manage scalable knowledge are required. Most of these systems use ontologies in order to exchange knowledge between machines and infer new knowledge, so approaches are needed that infer new knowledge over scalable ontologies. In this paper, we propose an approach to perform rule-based reasoning for scalable SHIF ontologies in the Spark framework, which works similarly to MapReduce over distributed memories on a cluster. To perform efficient reasoning in distributed memory, we focus on three areas. First, we define a data structure for splitting scalable ontology triples into small sets according to each reasoning rule and loading these triple sets into distributed memory. Second, a rule execution order and iteration conditions based on the dependencies and correlations among the SHIF rules are defined. Finally, we explain the operations used to execute the rules, which are based on the reasoning algorithms. To evaluate the suggested methods, we perform an experiment against WebPIE, a representative cluster-based ontology reasoner, using the LUBM dataset, standard data for evaluating ontology inference and search speed. Consequently, the proposed approach improves throughput by 28,400% (157k/sec) over WebPIE (553/sec) on LUBM.
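The second point above, a rule execution order derived from dependencies among the rules, can be sketched with a topological sort over a rule dependency graph. The rule names and dependencies below are illustrative, not the SHIF rule graph from the paper.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# An entry "B": {"A"} means rule B consumes triples that rule A can produce,
# so A should run before B.
dependencies = {
    "subClassOf_transitivity": set(),
    "type_propagation":        {"subClassOf_transitivity"},
    "subPropertyOf_closure":   set(),
    "property_inheritance":    {"subPropertyOf_closure"},
}

order = list(TopologicalSorter(dependencies).static_order())
print(order)
# Rules that depend on each other cyclically would instead be grouped and
# iterated together until no new triples are produced (the iteration condition).
```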

Ontology-based Cohort DB Search Simulation (온톨로지 기반 대용량 코호트 DB 검색 시뮬레이션)

  • Song, Joo-Hyung;Hwang, Jae-min;Choi, Jeongseok;Kang, Sanggil
    • Journal of the Korea Society for Simulation / v.25 no.1 / pp.29-34 / 2016
  • Many researchers have used cohort databases (DBs) to predict the occurrence of disease or to track its spread. A cohort DB is big data in which disease and health information is simply stored as separate DB tables, so to measure relations among the health information it must be reconstructed to suit the research purpose. In this paper, an XML descriptor and editor are used to build an ontology-based big-data cohort DB: the XML editor applies the seven-step Ontology Development 101 methodology and the OWL API to convert the cohort DB into ontology form. We also developed an ontology-based cohort DB search system that can measure the relations between disease and health information and check the results of those relations; it is particularly effective when the results sought are semantic relations.
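The conversion step, lifting flat cohort DB rows into ontology individuals so that semantic relations can be queried, might look like the rdflib sketch below; the class, property, and row names are hypothetical, not the paper's cohort schema (which is built with the OWL API).

```python
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/cohort#")   # hypothetical cohort namespace
g = Graph()
g.bind("ex", EX)

rows = [  # (patient_id, diagnosis, age) as they might appear in a cohort table
    ("p001", "Diabetes", 54),
    ("p002", "Hypertension", 61),
]

for pid, diagnosis, age in rows:
    patient = EX[pid]
    g.add((patient, RDF.type, EX.Patient))
    g.add((patient, EX.hasDiagnosis, EX[diagnosis]))
    g.add((patient, EX.hasAge, Literal(age)))

# Relations are then retrieved semantically with SPARQL instead of table joins.
results = g.query(
    "SELECT ?p WHERE { ?p <http://example.org/cohort#hasDiagnosis> "
    "<http://example.org/cohort#Diabetes> }")
for row in results:
    print(row.p)
```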

Scalable RDFS Reasoning using Logic Programming Approach in a Single Machine (단일머신 환경에서의 논리적 프로그래밍 방식 기반 대용량 RDFS 추론 기법)

  • Jagvaral, Batselem;Kim, Jemin;Lee, Wan-Gon;Park, Young-Tack
    • Journal of KIISE / v.41 no.10 / pp.762-773 / 2014
  • As the web of data increasingly produces large RDFS datasets, building scalable reasoning engines over large triple sets becomes essential. Many studies have used expensive distributed frameworks, such as Hadoop, to reason over large RDFS triples. In many cases, however, we are only required to handle millions of triples, and in such cases it is not necessary to deploy expensive distributed systems, because a logic-programming-based reasoner on a single machine can produce reasoning performance similar to that of a distributed reasoner using Hadoop. In this paper, we propose a scalable RDFS reasoner that uses logic programming methods on a single machine and compare our empirical results with those of distributed systems. We show that our logic-programming-based reasoner on a single machine performs comparably to an expensive distributed reasoner for up to 200 million RDFS triples. In addition, we designed a metadata structure that decomposes the ontology triples into separate sectors: instead of loading all the triples into a single model, we select an appropriate subset of the triples for each ontology reasoning rule. Unification makes it easy to handle conjunctive queries for RDFS schema reasoning, so we designed and implemented the RDFS axioms using logic programming unification and efficient conjunctive-query handling mechanisms. The throughput of our approach reached 166K triples/sec over LUBM1500 with 200 million triples, which is comparable to WebPIE, a distributed reasoner using Hadoop and MapReduce, at 185K triples/sec. This shows that a distributed system is unnecessary for up to 200 million triples and that the performance of a logic-programming-based reasoner on a single machine becomes comparable to that of an expensive distributed reasoner employing the Hadoop framework.
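A single-machine, rule-based RDFS pass of the kind compared here can be sketched with plain in-memory sets. The example applies only two RDFS rules (rdfs7 and rdfs9) to fixpoint; the paper's reasoner evaluates rules through logic-programming unification, which this simple forward-chaining loop only approximates.

```python
triples = {
    ("ex:Student",   "rdfs:subClassOf",    "ex:Person"),
    ("ex:worksWith", "rdfs:subPropertyOf", "ex:knows"),
    ("ex:alice",     "rdf:type",           "ex:Student"),
    ("ex:alice",     "ex:worksWith",       "ex:bob"),
}

def apply_rules(kb):
    """Return the triples derivable from kb in one pass that are not yet in kb."""
    sub_class = {(s, o) for s, p, o in kb if p == "rdfs:subClassOf"}
    sub_prop  = {(s, o) for s, p, o in kb if p == "rdfs:subPropertyOf"}
    new = set()
    for s, p, o in kb:
        if p == "rdf:type":                                          # rdfs9
            new |= {(s, "rdf:type", sup) for c, sup in sub_class if c == o}
        new |= {(s, sup, o) for p2, sup in sub_prop if p2 == p}      # rdfs7
    return new - kb

while True:
    derived = apply_rules(triples)
    if not derived:          # fixpoint: nothing new can be inferred
        break
    triples |= derived

print(sorted(t for t in triples if t[1] in ("rdf:type", "ex:knows")))
```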