• Title/Summary/Keyword: Query Processing Method

SPARQL Query Processing System over Scalable Triple Data using SparkSQL Framework (SparQLing : SparkSQL 기반 대용량 트리플 데이터를 위한 SPARQL 질의 시스템 구축)

  • Jeon, MyungJoong;Hong, JinYoung;Park, YoungTack
    • Journal of KIISE / v.43 no.4 / pp.450-459 / 2016
  • RDFS data grows larger every year, so SPARQL processing must change to keep queries fast. SPARQL query processing has therefore been studied on scalable distributed processing frameworks. Current studies indicate that query engines based on Hadoop (MapReduce) are not suitable for real-time processing because of their repetitive tasks; in addition, it is difficult to build a query engine on an in-memory distributed query engine, because the low-level distributed structure must be taken into account. In this paper, we propose a method for constructing a query engine that speeds up query processing over massive triple data. The query engine processes SPARQL queries using SparkSQL, an in-memory distributed query processing framework. SparkSQL is a high-level distributed query engine that supports existing SQL statements. To process a SPARQL query, we generate an algebra tree using Jena, translate it into a Spark algebra tree for use in the Spark system, and build a system that generates the SparkSQL query from it. Furthermore, we propose a DataFrame-based triple property table design for more efficient query processing in the Spark system. Finally, we verify the validity of the approach through a comparative evaluation against a query engine based on an existing distributed processing framework.
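
The core idea of turning SPARQL triple patterns into SQL can be illustrated with a minimal sketch. It is not the authors' system: it skips the Jena algebra tree, uses Python's built-in `sqlite3` in place of SparkSQL, and queries a plain three-column table named `triples(s, p, o)` instead of the DataFrame-based property table. The function name `bgp_to_sql` and the table layout are assumptions made for illustration, and variable names are assumed to be valid SQL identifiers.

```python
import sqlite3  # stand-in SQL engine for running the generated query


def bgp_to_sql(patterns):
    """Translate triple patterns into one SQL self-join (illustrative only).

    `patterns` is a list of (subject, predicate, object) strings; terms that
    start with '?' are variables, everything else is a constant. At least one
    variable is assumed. Returns (sql, params) for a table triples(s, p, o).
    """
    select, where, params = [], [], []
    first_use = {}  # variable -> column reference where it first appeared
    for i, pattern in enumerate(patterns):
        for col, term in zip(("s", "p", "o"), pattern):
            ref = f"t{i}.{col}"
            if term.startswith("?"):
                if term in first_use:
                    # A repeated variable becomes a join condition.
                    where.append(f"{ref} = {first_use[term]}")
                else:
                    first_use[term] = ref
                    select.append(f"{ref} AS {term[1:]}")
            else:
                # A constant becomes a bound equality filter.
                where.append(f"{ref} = ?")
                params.append(term)
    tables = ", ".join(f"triples t{i}" for i in range(len(patterns)))
    sql = f"SELECT {', '.join(select)} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params
```

Each triple pattern contributes one alias of the table, so a pattern list of length n produces an n-way self-join; a property table, as proposed in the abstract, is one way to reduce the number of such joins.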

Efficient Spatial Query Processing in Constraint Databases (제약 데이터베이스에서의 효율적인 공간질의 처리)

  • Woo, Sung-Koo;Ryu, Keun-Ho
    • Journal of Korea Spatial Information System Society / v.11 no.1 / pp.79-86 / 2009
  • A tuple in a constraint database consists of constraint logical formulas, which allows data in the database to be represented and queried simply. Query operations on spatial data, such as selection, union, and intersection, must combine the constraint formulas of the related tuples. However, this can increase the amount of duplicated or unnecessary data, which drives up processing cost. This paper identifies problems with query processing results in constraint databases. It also proposes a tuple minimization method for result relations and analyzes its effect on query processing efficiency. We found that the tuple minimization method improves query processing by eliminating unnecessary constraint formulas from constraint relations.

Design and Implementation of Load Balancing Method for Efficient Spatial Query Processing in Clustering Environment (클러스터링 환경에서 효율적인 공간 질의 처리를 위한 로드 밸런싱 기법의 설계 및 구현)

  • 김종훈;이찬구;정현민;정미영;배영호
    • Journal of Korea Multimedia Society / v.6 no.3 / pp.384-396 / 2003
  • The hybrid query processing method is used to prevent the server overload caused by heavy user connections in Web GIS. In the hybrid method, both server and client participate in spatial query processing. However, the hybrid method is limited by server scalability and cannot be a fundamental solution to server overload. It is therefore necessary to introduce web clustering techniques into Web GIS. In this paper, we propose a load-balancing method that uses the proximity of query regions. We create tile groups in which the tiles of a group are close to each other and, considering the characteristics of spatial queries, forward each client request to the server expected to have the highest buffer reuse rate. With our load-balancing method, the buffer in each server is optimized for exploring the spatial index tree and the buffer reuse rate increases, which reduces disk accesses and improves system performance.

Efficient Skyline Query Processing Scheme in Mobile P2P Networks (모바일 P2P 네트워크에서 효율적인 스카이라인 질의 처리 기법)

  • Bok, Kyoung-Soo;Park, Sun-Yong;Kim, Dae-Yeon;Lim, Jong-Tae;Shin, Jae-Ryong;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.15 no.7 / pp.30-42 / 2015
  • In this paper, we propose a new skyline query processing scheme that improves the accuracy of query processing and reduces communication cost in mobile P2P environments. The proposed scheme consists of three stages: pre-skyline processing, a query transmission range extension policy, and continuous skyline query processing. In pre-skyline processing, a peer selects candidate filtering objects that have the potential to be selected, which reduces the filtering cost when the query is processed. The query transmission range extension policy improves accuracy by extending the range over which the query is transmitted. In addition, the scheme handles continuous skyline queries by monitoring after the first skyline query has been processed. To show the superiority of the proposed method, we compare it with existing schemes through a performance evaluation. The results show that the proposed scheme outperforms the existing schemes.
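
For readers unfamiliar with skyline queries, the sketch below shows the textbook definition the abstract builds on: a point belongs to the skyline if no other point dominates it. It is a minimal illustration, not the proposed scheme; the convention that smaller values are better, the function names, and the `merge_skylines` helper (a peer combining local results received from other peers) are all assumptions.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if `a` is no worse than `b` in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def merge_skylines(local_results: Iterable[List[Point]]) -> List[Point]:
    """Combine local skylines from several peers (hypothetical helper).

    The skyline of the union equals the skyline of the union of the local
    skylines, so only local skyline points need to be transmitted.
    """
    candidates = [p for result in local_results for p in result]
    return skyline(candidates)
```

Sending only local skyline points rather than all objects is the usual reason skyline processing can save communication in a distributed setting; the filtering objects mentioned in the abstract go a step further by pruning candidates before they are sent.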

An Index Structure for Efficient X-Path Processing on S-XML Data (S-XML 데이터의 효율적인 X-Path 처리를 위한 색인 구조)

  • Zhang, Gi;Jang, Yong-Il;Park, Soon-Young;Oh, Young-Hwan;Bae, Hae-Young
    • Proceedings of the Korea Information Processing Society Conference / 2005.05a / pp.51-54 / 2005
  • This paper proposes an index structure for processing X-Path queries on S-XML data. Many previous index structures for X-Path processing are based on trees. Because a general tree index is queried top-down, unnecessary node traversal causes heavy access and degrades query processing performance. In addition, the proposed index structure must support both X-Path query types, single-path queries and branching queries. The method combines a path summary with node indexing. First, it maintains a hash over the hierarchy elements that appear as tags in S-XML. Second, array blocks called path summary arrays are created in each hash node to store path information. X-Path processing finds the tag element through the hash and checks the array blocks in each node to determine the paths in the query result. With this structure, the index supports both single-path and branching path queries and improves X-Path processing performance.

Cost-based Optimization of Extended Boolean Queries (확장 불리언 질의에 대한 비용 기반 최적화)

  • 박병권
    • Journal of the Korean Society for Information Management / v.18 no.3 / pp.29-40 / 2001
  • In this paper, we suggest a query optimization algorithm that selects the optimal method for processing an extended boolean query on inverted files. An extended boolean query can be processed in many ways, depending on the order in which the keywords contained in the query are processed. In this sense, the problem of optimizing an extended boolean query is essentially that of optimizing the keyword sequence in the query. We show that the problem is analogous to finding the optimal join order in database query optimization, and we apply ideas from that area to solve it. We establish a cost model for processing an extended boolean query and develop an algorithm that finds the optimal keyword-processing sequence based on the concept of keyword rank, using keyword selectivity and the access costs of the inverted file. We prove that the method selected by the optimization algorithm is optimal and show, through experiments, that it outperforms the other methods. We believe that the suggested optimization algorithm will contribute to a significant improvement in information retrieval performance.

Efficient Query Indexing for Short Interval Query (짧은 구간을 갖는 범위 질의의 효율적인 질의 색인 기법)

  • Kim, Jae-In;Song, Myung-Jin;Han, Dae-Young;Kim, Dae-In;Hwang, Bu-Hyun
    • The KIPS Transactions:PartD / v.16D no.4 / pp.507-516 / 2009
  • In a stream data processing system, interval queries are generally registered in the system in advance. As data arrives continuously, a query index is used to quickly find the matching queries for real-time processing. A main-memory query index with a small storage cost and a fast search time is therefore needed. In this paper, we propose an LVC-based (Limited Virtual Construct-based) query indexing method that uses hashing to meet both needs. In the LVC-based query index, we divide the range of a stream into limited virtual constructs (LVCs). We map each interval query to its corresponding LVCs and store the query ID on each of them. We compared the method with the CEI-based query indexing method through simulation experiments. When the range of input stream values is broad and there are many short interval queries, the LVC-based indexing method showed improved storage cost and search time.
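
The general shape of such an index can be sketched as follows. This is a minimal illustration under stated assumptions, not the published data structure: the abstract does not say how LVCs are sized or laid out, so the sketch assumes fixed-width cells kept in a hash table, and the class name `LVCQueryIndex`, its methods, and the final exact-bounds check are hypothetical.

```python
from collections import defaultdict


class LVCQueryIndex:
    """Hash-based index from fixed-width value cells to interval query IDs."""

    def __init__(self, width):
        self.width = width              # assumed fixed size of each cell
        self.cells = defaultdict(set)   # cell number -> IDs of queries touching it
        self.queries = {}               # query ID -> (low, high) bounds

    def register(self, query_id, low, high):
        """Store the query ID on every cell its interval [low, high] overlaps."""
        self.queries[query_id] = (low, high)
        first = int(low // self.width)
        last = int(high // self.width)
        for cell in range(first, last + 1):
            self.cells[cell].add(query_id)

    def search(self, value):
        """Return the sorted IDs of registered queries whose interval contains `value`."""
        cell = int(value // self.width)
        candidates = self.cells.get(cell, ())
        # A cell may hold queries that only partly cover it, so check bounds.
        return sorted(
            q for q in candidates
            if self.queries[q][0] <= value <= self.queries[q][1]
        )
```

Only cells that some query overlaps are stored, so a broad value range costs nothing by itself, and a short interval touches few cells. That is consistent with the setting in which the abstract reports gains, although the comparison with CEI-based indexing is the paper's and is not reproduced here.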

Efficient k-Nearest Neighbor Query Processing Method for a Large Location Data (대용량 위치 데이터에서 효율적인 k-최근접 질의 처리 기법)

  • Choi, Dojin;Lim, Jongtae;Yoo, Seunghun;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.17 no.8 / pp.619-630 / 2017
  • With the growing popularity of smart devices, various location-based services are being provided to users. Recently, location-based social applications that combine social services with location-based services have emerged. Demand for k-nearest neighbor (k-NN) queries, which find the k closest locations to a user's location, is increasing in location-based social network services. In this paper, we propose an approximate k-NN query processing method that provides fast response times in environments with a large number of users. The proposed method performs efficient stream processing using big data distributed processing technologies. We also propose a modified grid index method for indexing a large amount of location data. The proposed query processing method first retrieves the related cells by considering the user's movement, which yields an approximate set of k results. To show the superiority of the proposed method, we conduct various performance evaluations against the existing method.
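
A grid index answers an approximate k-NN query by looking only at cells near the query point. The sketch below illustrates that general idea with a plain uniform grid on a single machine; it is not the paper's modified grid index, it ignores user movement and distributed stream processing, and the class name `GridIndex`, the `max_rings` parameter, and the ring-expansion strategy are assumptions.

```python
import math
from collections import defaultdict


class GridIndex:
    """Uniform grid over 2-D points with an approximate k-NN lookup."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(dict)  # (cx, cy) -> {object_id: (x, y)}

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, obj_id, x, y):
        self.cells[self._cell(x, y)][obj_id] = (x, y)

    def approx_knn(self, x, y, k, max_rings=3):
        """Collect objects ring by ring around the query cell, stop once at
        least k candidates are found, and return the k closest of them."""
        cx, cy = self._cell(x, y)
        found = {}
        for ring in range(max_rings + 1):
            for dx in range(-ring, ring + 1):
                for dy in range(-ring, ring + 1):
                    if max(abs(dx), abs(dy)) != ring:
                        continue  # visit only the cells on the current ring
                    found.update(self.cells.get((cx + dx, cy + dy), {}))
            if len(found) >= k:
                break
        ranked = sorted(found, key=lambda o: math.dist((x, y), found[o]))
        return ranked[:k]
```

The answer is approximate because the search stops as soon as enough candidates are found: a closer object just across a cell boundary can be missed. That is the trade of accuracy for response time that an approximate method accepts.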

XQuery Query Rewriting for Query Optimization in Distributed Environments (분산 환경에 질의 최적화를 위한 XQuery 질의 재작성)

  • Park, Jong-Hyun;Kang, Ji-Hoon
    • Journal of the Korea Society of Computer and Information / v.14 no.3 / pp.1-11 / 2009
  • XQuery, proposed by the W3C, is one of the standard query languages for XML data and is widely accepted by many applications. Efficient processing of XQuery queries has therefore recently become a topic of critical importance, and XQuery optimization is one of the new issues in these studies. However, previous research focuses on optimization techniques for a specific XML data management system, and these techniques cannot be used with arbitrary XML data management systems. Some previous research also uses predefined XML structure information, such as an XML schema or DTD, for optimization. In practice, however, not all applications refer to structure information for their XML data. This paper therefore analyzes only the XQuery query and optimizes it using the query itself. We propose three optimization methods that consider the characteristics of XQuery queries. The first method removes redundant expressions in the query, the second reorders operations and clauses in the query, and the third rewrites the query based on its FOR clauses. For the third method, we consider the FOR clause because it generally generates a loop in an XQuery query, and the loop often increases the execution frequency of redundant operations. Through a performance evaluation, we show that the processing time of rewritten queries is lower than that of the original queries. Each method in our XQuery optimizer can also be used separately because the methods are independent.

In-network Aggregation Query Processing using the Data-Loss Correction Method in Data-Centric Storage Scheme (데이터 중심 저장 환경에서 소설 데이터 보정 기법을 이용한 인-네트워크 병합 질의 처리)

  • Park, Jun-Ho;Lee, Hyo-Joon;Seong, Dong-Ook;Yoo, Jae-Soo
    • Journal of KIISE:Databases / v.37 no.6 / pp.315-323 / 2010
  • In Wireless Sensor Networks (WSNs), various Data-Centric Storage (DCS) schemes have been proposed to store collected data and process queries efficiently. A DCS scheme assigns distributed data regions to sensor nodes and stores collected data at the sensor node responsible for the corresponding data region, so that queries can be processed efficiently. However, because all the data stored in a node is lost when the node fails, the accuracy of query processing drops. In this paper, we propose an in-network aggregation query processing method that assures high accuracy of the query result when data is lost due to node faults in a DCS scheme. When data loss occurs, the proposed method creates a compensation model for the area of data loss using linear regression and returns a query result that includes the virtual data. It yields a highly accurate query result in spite of node faults. To show the superiority of the proposed method, we compare E-KDDCS (KDDCS with the proposed method) with existing DCS schemes that lack the data-loss correction method. The results show that the proposed method increases accuracy and reduces query processing cost compared with the existing schemes.
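
The compensation step can be illustrated with a minimal sketch: fit a line to the readings that survived, generate virtual readings for the lost area, and aggregate over both. The abstract does not specify the regression variables or the aggregate, so the sketch assumes readings keyed by a one-dimensional position and an AVG query; the function names `fit_line` and `average_with_compensation` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def average_with_compensation(readings, lost_positions):
    """Average over surviving readings plus regression-based virtual data.

    `readings` maps a position to its surviving value (assumed 1-D layout);
    `lost_positions` lists positions whose data was lost with a failed node.
    """
    xs = list(readings)
    ys = [readings[x] for x in xs]
    slope, intercept = fit_line(xs, ys)
    # Virtual data: predicted values standing in for the lost readings.
    virtual = [slope * x + intercept for x in lost_positions]
    values = ys + virtual
    return sum(values) / len(values)
```

Without compensation, an aggregate computed only over surviving nodes is biased toward the regions that happened not to fail; filling the gap with predicted values is one way to reduce that bias when the data follows a roughly linear trend.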