• Title/Summary/Keyword: Query Pruning

A Query Pruning Technique for Optimizing Regular Path Expressions in Semistructured Databases (준구조적 데이타베이스에서의 정규경로표현 최적화를 위한 질의전지 기법)

  • Park, Chang-Won;Jeong, Jin-Wan
    • Journal of KIISE:Databases
    • /
    • v.29 no.3
    • /
    • pp.217-229
    • /
    • 2002
  • Regular path expressions are primary elements for formulating queries over semistructured data, which does not assume a conventional schema. Query pruning is an important optimization technique that avoids useless traversals when evaluating regular path expressions. However, existing query pruning often fails to fully optimize multiple regular path expressions, and previous methods that post-process the result of existing query pruning must check an exponential number of combinations of sub-results. In this paper, we present a new query pruning technique that consists of a preprocessing phase and a pruning phase. Our two-phase query pruning is effective in optimizing multiple regular path expressions, and is more scalable than the previous methods in that it never checks the exponential combinations of sub-results.

k-Nearest Neighbor Query Processing in Multi-Dimensional Indexing Structures (다차원 인덱싱 구조에서의 k-근접객체질의 처리 방안)

  • Kim Byung Gon;Oh Sung Kyun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.10 no.1 s.33
    • /
    • pp.85-92
    • /
    • 2005
  • Recently, query processing techniques for multi-dimensional data such as images have been widely used to perform content-based retrieval. Range queries and nearest neighbor queries are widely used multi-dimensional queries. This paper proposes efficient pruning strategies for k-nearest neighbor queries in R-tree variant indexing structures. A pruning strategy is important for multi-dimensional index query processing because it reduces the search space. We analyze the pruning strategies and perform experiments to show the overhead and the benefit of each strategy. Finally, we propose the best use of the strategies.

AutoCor: A Query Based Automatic Acquisition of Corpora of Closely-related Languages

  • Dimalen, Davis Muhajereen D.;Roxas, Rachel Edita O.
    • Proceedings of the Korean Society for Language and Information Conference
    • /
    • 2007.11a
    • /
    • pp.146-154
    • /
    • 2007
  • AutoCor is a method for the automatic acquisition and classification of corpora of documents in closely-related languages. It is an extension and enhancement of CorpusBuilder, a system that automatically builds specific minority language corpora from a closed corpus, since some Tagalog documents retrieved by CorpusBuilder are actually documents in other closely-related Philippine languages. AutoCor uses the odds ratio query generation method and introduces the concept of common word pruning to differentiate between documents in closely-related Philippine languages and documents in Tagalog. The performance of the system with and without pruning is compared, and common word pruning was found to improve the precision of the system.

A Density-Based K-Nearest Neighbors Search Method

  • Jang I. S.;Min K.W.;Choi W.S
    • Proceedings of the KSRS Conference
    • /
    • 2004.10a
    • /
    • pp.260-262
    • /
    • 2004
  • Spatial database systems provide many query types, and most of them require frequent disk I/O and much CPU time. A k-NN search finds the k closest objects to the query point, and several k-NN search methods have been proposed so far. Among these, the MINMAX distance method aims to avoid visiting unnecessary nodes by applying a pruning technique. However, this method accesses the disk more than necessary while pruning unnecessary nodes. In this paper, we propose a new k-NN search algorithm based on the density of objects. With this method, we use the density of the data set to predict the radius expected to contain the k nearest objects, search for objects within this radius, and then adjust the radius if the search fails. Experimental results show that this method outperforms the previous MINMAX distance method: it performs fewer disk accesses, by up to 22% and by 6% on average.
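
The predict, search, and adjust loop described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes 2-D points spread roughly uniformly over a region of known `area`, replaces the spatial index with a linear scan, and enlarges the radius by a fixed `growth` factor when too few objects are found. The names `estimate_radius`, `density_knn`, and `growth` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def estimate_radius(k: int, n: int, area: float) -> float:
    """Radius of a circle expected to hold k objects under uniform density.

    Assumption: density = n / area, so k = density * pi * r^2.
    """
    density = n / area
    return math.sqrt(k / (math.pi * density))

def density_knn(points: List[Point], query: Point, k: int,
                area: float, growth: float = 1.5) -> List[Point]:
    """Return the k points nearest to query using a density-predicted radius."""
    if k <= 0 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    radius = estimate_radius(k, len(points), area)
    while True:
        # Range search; a real system would use a spatial index here.
        hits = [p for p in points if math.dist(p, query) <= radius]
        if len(hits) >= k:
            # If k objects lie within the radius, the true k nearest do too.
            return sorted(hits, key=lambda p: math.dist(p, query))[:k]
        radius *= growth  # prediction failed: enlarge the radius and retry

# Usage: a 10 x 10 grid of points, treated as covering an area of 100.
grid = [(float(x), float(y)) for x in range(10) for y in range(10)]
nearest = density_knn(grid, (4.2, 4.2), k=3, area=100.0)
```

A tighter initial radius means fewer objects examined per search, at the cost of more retries when the local density is lower than the global estimate.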

Partial Image Retrieval Using an Efficient Pruning Method (효율적인 Pruning 기법을 이용한 부분 영상 검색)

  • 오석진;오상욱;김정림;문영식;설상훈
    • Journal of Broadcast Engineering
    • /
    • v.7 no.2
    • /
    • pp.145-152
    • /
    • 2002
  • As the number of digital images available to users grows exponentially with the rapid development of digital technology, content-based image retrieval (CBIR) has become one of the most active research areas. A variety of image retrieval methods have been proposed in which, given an input query image, images similar to the input are retrieved from an image database based on low-level features such as colors and textures. However, most existing retrieval methods do not consider the case in which the input query image is a part of a whole image in the database, because of the high complexity involved in partial matching. In this paper, we present an efficient method for partial image matching that uses the histogram distribution relationships between the query image and the whole image. The proposed approach consists of two steps: the first step prunes the search space, and the second step performs block-based retrieval using partial image matching to rank the images in the candidate set. The experimental results demonstrate the feasibility of the proposed algorithm, given that the response time of the system is very high when retrieval relies on partial image matching alone without pruning the search space.

An Efficient Angular Space Partitioning Based Skyline Query Processing Using Sampling-Based Pruning (데이터 샘플링 기반 프루닝 기법을 도입한 효율적인 각도 기반 공간 분할 병렬 스카이라인 질의 처리 기법)

  • Choi, Woosung;Kim, Minseok;Diana, Gromyko;Chung, Jaehwa;Jung, Soonyong
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.1
    • /
    • pp.1-8
    • /
    • 2017
  • Given a multi-dimensional dataset of tuples, a skyline query returns the subset of tuples that are not 'dominated' by any other tuple. The skyline query is very useful in big data analysis since it filters out uninteresting items. Much interest has been devoted to MapReduce-based parallel processing of skyline queries in large-scale distributed environments. There are three requirements for improving parallelism in MapReduce-based algorithms: (1) the workload should be well balanced, (2) redundant computations should be avoided, and (3) network communication cost should be optimized. In this paper, we introduce MR-SEAP (MapReduce sample Skyline object Equality Angular Partitioning), an efficient angular space partitioning based skyline query processing method using sampling-based pruning, which satisfies the requirements above. We conduct an extensive experiment to evaluate MR-SEAP.
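
The dominance-based definition of a skyline in this abstract can be made concrete with a small, single-machine Python sketch. It is not MR-SEAP and involves no MapReduce or angular partitioning. It assumes that smaller values are preferred in every dimension, and the names `dominates` and `skyline` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]

def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Usage: hotels as (price, distance to beach); lower is better for both.
hotels = [(50, 8), (60, 2), (80, 1), (70, 5), (90, 9)]
result = skyline(hotels)
# (70, 5) is dominated by (60, 2) and (90, 9) by (50, 8), so both are dropped.
```

The quadratic comparison cost of this naive version is what partitioning and pruning schemes aim to reduce.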

On Efficient Processing of Continuous Reverse Skyline Queries in Wireless Sensor Networks

  • Yin, Bo;Zhou, Siwang;Zhang, Shiwen;Gu, Ke;Yu, Fei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.4
    • /
    • pp.1931-1953
    • /
    • 2017
  • The reverse skyline query plays an important role in information searching applications. This paper deals with continuous reverse skyline queries in sensor networks, which retrieve reverse skylines as well as the set of nodes that reported them over continuous sampling epochs. Designing an energy-efficient approach to answering continuous reverse skyline queries is non-trivial because the reverse skyline query is not decomposable and a huge number of unqualified nodes need to report their sensor readings. In this paper, we develop a new algorithm that avoids transmission of updates from nodes that cannot influence the reverse skyline. We propose a data mapping scheme to estimate sensor readings and determine their dominance relationships without having to know the true values. We also theoretically analyze the properties of reverse skyline computation and propose efficient pruning techniques that guarantee the correctness of the answer. An extensive experimental evaluation demonstrates the efficiency of our approach.

Efficient Query Retrieval from Social Data in Neo4j using LIndex

  • Mathew, Anita Brigit
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.5
    • /
    • pp.2211-2232
    • /
    • 2018
  • Unstructured and semi-structured big data in social networks pose new challenges for query retrieval. These challenges need to be met by measures that improve retrieval time, such as indexing. Because of the huge volume of stored data, there is a need for efficient index algorithms to speed up query processing. However, conventional algorithms fail to index the huge amount of frequently obtained information in real time and fall short of providing a scalable indexing service. In this paper, a new algorithm, LIndex, a heuristic on Lucene, is built on the Neo4jHA architecture that holds the social network big data. LIndex is a flexible and simplified adaptive indexing scheme that uses decomposed shortest paths around term neighbors as its basic indexing unit. The new index proves effective in query space pruning for the graph database Neo4j and scalable in index construction and deployment. A graph query is processed and optimized beyond traditional Lucene, moving from a time-based manner to a more efficient path method in LIndex. The algorithm significantly reduces query fetch time without compromising the quality of results. Experiments are conducted to confirm the efficiency of the proposed query retrieval in the Neo4j graph NoSQL database.

A Method for k Nearest Neighbor Query of Line Segment in Obstructed Spaces

  • Zhang, Liping;Li, Song;Guo, Yingying;Hao, Xiaohong
    • Journal of Information Processing Systems
    • /
    • v.16 no.2
    • /
    • pp.406-420
    • /
    • 2020
  • To address the shortcomings of existing research, which cannot effectively handle nearest neighbor queries based on line segments in obstacle space, a k nearest neighbor query method for line segments in obstacle space is proposed, and the STA_OLkNN algorithm for static obstacle data sets is put forward. The query process is divided into two stages: a filtering process and a refining process. In the filtering process, pruning rules are derived from the properties of the line segment Voronoi diagram, and the filtering algorithm is presented. In the refining process, a distance expression method is derived from the positional relationship between the line segments, and the final result is obtained by comparing distances. Theoretical analysis and experimental results show that the proposed algorithm can effectively handle k nearest neighbor queries of line segments in an obstacle environment.

Reverse k-Nearest Neighbor Query Processing Method for Continuous Query Processing in Bigdata Environments (빅데이터 환경에서 연속 질의 처리를 위한 리버스 k-최근접 질의 처리 기법)

  • Lim, Jongtae;Park, Sunyong;Seo, Kiwon;Lee, Minho;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association
    • /
    • v.14 no.10
    • /
    • pp.454-462
    • /
    • 2014
  • With the development of location-aware technologies and mobile devices, location-based services have been widely studied. To provide location-based services, many researchers have proposed methods for processing various query types with MapReduce (MR). One of the proposed methods is a reverse k-nearest neighbor (RkNN) query processing method with MR. However, the existing methods incur too high a cost when processing continuous RkNN queries. In this paper, we propose an efficient continuous RkNN query processing method with MR to resolve the problems of the existing methods. The proposed method uses the 60-degree-pruning method. It does not need to reprocess the query for continuous query processing because it draws and monitors a monitoring area that includes the candidate objects of an RkNN query. To show the superiority of the proposed method, we compare its query processing performance with that of the existing method.