• Title/Summary/Keyword: 인메모리 기반 분산 질의 엔진

Search Result 2, Processing Time 0.014 seconds

SPARQL Query Processing System over Scalable Triple Data using SparkSQL Framework (SparQLing : SparkSQL 기반 대용량 트리플 데이터를 위한 SPARQL 질의 시스템 구축)

  • Jeon, MyungJoong;Hong, JinYoung;Park, YoungTack
    • Journal of KIISE
    • /
    • v.43 no.4
    • /
    • pp.450-459
    • /
    • 2016
  • Every year, RDFS data tends further toward scalability; hence, the manner of SPARQL processing needs to be changed for fast query. The query processing method of SPARQL has been studied using a scalable distributed processing framework. Current studies indicate that the query engine based on the scalable distributed processing framework i.e., Hadoop(MapReduce) is not suitable for real-time processing because of the repetitive tasks; in addition, it is difficult to construct a query engine based on an In-memory Distributed Query engine, because distributed structure on the low-level is required to be considered. In this paper, we proposed a method to construct a query engine for improving the speed of the query process with the mass triple data. The query engine processes the query of SPARQL using the SparkSQL, which is an In-memory based, distributed query processing framework. SparkSQL is a high-level distributed query engine that facilitates existing SQL statement. In order to process the SPARQL query, after generating the Algebra Tree using Jena, the Algebra Tree is required to be translated to Spark Algebra Tree for application in the Spark system, and construction of the system that generated the SparkSQL query. Furthermore, we proposed the design of triple property table based on DataFrame for more efficient query processing in the Spark system. Finally, we verified the validity through comparative evaluation with the query engine, which is the existing distributed processing framework.

Distributed In-Memory based Large Scale RDFS Reasoning and Query Processing Engine for the Population of Temporal/Spatial Information of Media Ontology (미디어 온톨로지의 시공간 정보 확장을 위한 분산 인메모리 기반의 대용량 RDFS 추론 및 질의 처리 엔진)

  • Lee, Wan-Gon;Lee, Nam-Gee;Jeon, MyungJoong;Park, Young-Tack
    • Journal of KIISE
    • /
    • v.43 no.9
    • /
    • pp.963-973
    • /
    • 2016
  • Providing a semantic knowledge system using media ontologies requires not only conventional axiom reasoning but also knowledge extension based on various types of reasoning. In particular, spatio-temporal information can be used in a variety of artificial intelligence applications and the importance of spatio-temporal reasoning and expression is continuously increasing. In this paper, we append the LOD data related to the public address system to large-scale media ontologies in order to utilize spatial inference in reasoning. We propose an RDFS/Spatial inference system by utilizing distributed memory-based framework for reasoning about large-scale ontologies annotated with spatial information. In addition, we describe a distributed spatio-temporal SPARQL parallel query processing method designed for large scale ontology data annotated with spatio-temporal information. In order to evaluate the performance of our system, we conducted experiments using LUBM and BSBM data sets for ontology reasoning and query processing benchmark.