• Title/Summary/Keyword: Multiple Query Optimization

Search Result 20, Processing Time 0.03 seconds

Attribute-based Approach for Multiple Continuous Queries over Data Streams (데이터 스트림 상에서 다중 연속 질의 처리를 위한 속성기반 접근 기법)

  • Lee, Hyun-Ho;Lee, Won-Suk
    • The KIPS Transactions:PartD
    • /
    • v.14D no.5
    • /
    • pp.459-470
    • /
    • 2007
  • A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Query processing for such a data stream should also be continuous and rapid, which requires strict time and space constraints. In most DSMS(Data Stream Management System), the selection predicates of continuous queries are grouped or indexed to guarantee these constraints. This paper proposes a new scheme tailed an ASC(Attribute Selection Construct) that collectively evaluates selection predicates containing the same attribute in multiple continuous queries. An ASC contains valuable information, such as attribute usage status, partially pre calculated matching results and selectivity statistics for its multiple selection predicates. The processing order of those ASC's that are corresponding to the attributes of a base data stream can significantly influence the overall performance of multiple query evaluation. Consequently, a method of establishing an efficient evaluation order of multiple ASC's is also proposed. Finally, the performance of the proposed method is analyzed by a series of experiments to identify its various characteristics.

Optimizing Multi-way Join Query Over Data Streams (데이타 스트림에서의 다중 조인 질의 최적화 방법)

  • Park, Hong-Kyu;Lee, Won-Suk
    • Journal of KIISE:Databases
    • /
    • v.35 no.6
    • /
    • pp.459-468
    • /
    • 2008
  • A data stream which is a massive unbounded sequence of data elements continuously generated at a rapid rate. Many recent research activities for emerging applications often need to deal with the data stream. Such applications can be web click monitoring, sensor data processing, network traffic analysis. telephone records and multi-media data. For this. data processing over a data stream are not performed on the stored data but performed the newly updated data with pre-registered queries, and then return a result immediately or periodically. Recently, many studies are focused on dealing with a data stream more than a stored data set. Especially. there are many researches to optimize continuous queries in order to perform them efficiently. This paper proposes a query optimization algorithm to manage continuous query which has multiple join operators(Multi-way join) over data streams. It is called by an Extended Greedy query optimization based on a greedy algorithm. It defines a join cost by a required operation to compute a join and an operation to process a result and then stores all information for computing join cost and join cost in the statistics catalog. To overcome a weak point of greedy algorithm which has poor performance, the algorithm selects the set of operators with a small lay, instead of operator with the smallest cost. The set is influenced the accuracy and execution time of the algorithm and can be controlled adaptively by two user-defined values. Experiment results illustrate the performance of the EGA algorithm in various stream environments.

Query Optimization with Metadata Routing Tables on Nano-Q+ Sensor Network with Multiple Heterogeneous Sensors (다중 이기종 센서를 보유한 Nano-Q+ 기반 센서네트워크에서 메타데이타 라우팅 테이블을 이용한 질의 최적화)

  • Nam, Young-Kwang;Choe, Gui-Ja;Lee, Byoung-Dai;Kwak, Kwang-Woong;Lee, Kwang-Yong;Mah, Pyoung-Soo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.1
    • /
    • pp.13-21
    • /
    • 2008
  • In general, data communication among sensor nodes requires more energy than internal processing or sensing activities. In this paper, we propose a noble technique to reduce the number of packet transmissions necessary for sending/receiving queries/results among neighboring nodes with the help of context-aware routing tables. The important information maintained in the context-aware routing table is which physical properties can be measured by descendent nodes reachable from the current node. Based on the information, the node is able to eliminate unnecessary packet transmission by filtering out the child nodes for query dissemination or result relaying. The simulation results show that up to 80% of performance gains can be achieved with our technique.

Linear Resource Sharing Method for Query Optimization of Sliding Window Aggregates in Multiple Continuous Queries (다중 연속질의에서 슬라이딩 윈도우 집계질의 최적화를 위한 선형 자원공유 기법)

  • Baek, Seong-Ha;You, Byeong-Seob;Cho, Sook-Kyoung;Bae, Hae-Young
    • Journal of KIISE:Databases
    • /
    • v.33 no.6
    • /
    • pp.563-577
    • /
    • 2006
  • A stream processor uses resource sharing method for efficient of limited resource in multiple continuous queries. The previous methods process aggregate queries to consist the level structure. So insert operation needs to reconstruct cost of the level structure. Also a search operation needs to search cost of aggregation information in each size of sliding windows. Therefore this paper uses linear structure for optimization of sliding window aggregations. The method comprises of making decision, generation and deletion of panes in sequence. The decision phase determines optimum pane size for holding accurate aggregate information. The generation phase stores aggregate information of data per pane from stream buffer. At the deletion phase, panes are deleted that are no longer used. The proposed method uses resources less than the method where level structures were used as data structures as it uses linear data format. The input cost of aggregate information is saved by calculating only pane size of data though numerous stream data is arrived, and the search cost of aggregate information is also saved by linear searching though those sliding window size is different each other. In experiment, the proposed method has low usage of memory and the speed of query processing is increased.

Multi -Query Processing using the Grid Structure in Wireless Sensor Networks (무선 센서 네트워크 환경에서 그리드 구조를 이용한 다중 질의 처리 기법)

  • Kang, Gwang-Goo;Seong, Dong-Ook;Yoo, Jae-Soo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.11
    • /
    • pp.1086-1090
    • /
    • 2010
  • In recent, as many applications of sensor networks increase, various techniques have been studied to efficiently operate network systems. The query optimization scheme that is one of such techniques has been studied to reduce the data transmission cost. The data transmission is of great importance to the energy consumption of sensor networks. In this paper, we propose an energy-efficient multiple queries processing scheme by sharing sensor readings for multiple queries, when they are occurred in sensor networks. The proposed scheme reduces unnecessary data transmissions among the sensor nodes by intuitively identifying their locations using the grid structure. It also efficiently shares the data by recognizing the redundant regions of sensor nodes. In order to show the superiority of the proposed scheme, we compare it with the existing scheme in various experiments. As the result, the proposed scheme reduces about 65% energy consumption over the existing scheme.

Optimization of Post-Processing for Subsequence Matching in Time-Series Databases (시계열 데이터베이스에서 서브시퀀스 매칭을 위한 후처리 과정의 최적화)

  • Kim, Sang-Uk
    • The KIPS Transactions:PartD
    • /
    • v.9D no.4
    • /
    • pp.555-560
    • /
    • 2002
  • Subsequence matching, which consists of index searching and post-processing steps, is an operation that finds those subsequences whose changing patterns are similar to that of a given query sequence from a time-series database. This paper discusses optimization of post-processing for subsequence matching. The common problem occurred in post-processing of previous methods is to compare the candidate subsequence with the query sequence for discarding false alarms whenever each candidate subsequence appears during index searching. This makes a sequence containing candidate subsequences to be accessed multiple times from disk, and also have a candidate subsequence to be compared with the query sequence multiple times. These redundancies cause the performance of subsequence matching to degrade seriously. In this paper, we propose a new optimal method for resolving the problem. The proposed method stores ail the candidate subsequences returned by index searching into a binary search tree, and performs post-processing in a batch fashion after finishing the index searching. By this method, we are able to completely eliminate the redundancies mentioned above. For verifying the performance improvement effect of the proposed method, we perform extensive experiments using a real-life stock data set. The results reveal that the proposed method achieves 55 times to 156 times speedup over the previous methods.

Selectivity Estimation using the Generalized Cumulative Density Histogram (일반화된 누적밀도 히스토그램을 이용한 공간 선택율 추정)

  • Chi, Jeong-Hee;Kim, Sang-Ho;Ryu, Keun-Ho
    • The KIPS Transactions:PartD
    • /
    • v.11D no.4
    • /
    • pp.983-990
    • /
    • 2004
  • Multiple-count problem is occurred when rectangle objects span across several buckets. The CD histogram is a technique which selves this problem by keeping four sub-histograms corresponding to the four points of rectangle. Although It provides exact results with constant response time, there is still a considerable issue. Since it is based on a query window which aligns with a given grid, a number of errors nay be occurred when it is applied to real applications. In this paper, we propose selectivity estimation techniques using the generalized cumulative density histogram based on two probabilistic models : \circled1 probabilistic model which considers the query window area ratio, \circled2 probabilistic model which considers intersection area between a given grid and objects. Our method has the capability of eliminating an impact of the restriction on query window which the existing cumulative density histogram has. We experimented with real datasets to evaluate the proposed methods. Experimental results show that the proposed technique is superior to the existing selectivity estimation techniques. Furthermore, selectivity estimation technique based on probabilistic model considering the intersection area is very accurate(less than 5% errors) at 20% query window. The proposed techniques can be used to accurately quantify the selectivity of the spatial range query on rectangle objects.

Enabling Efficient Verification of Dynamic Data Possession and Batch Updating in Cloud Storage

  • Qi, Yining;Tang, Xin;Huang, Yongfeng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.6
    • /
    • pp.2429-2449
    • /
    • 2018
  • Dynamic data possession verification is a common requirement in cloud storage systems. After the client outsources its data to the cloud, it needs to not only check the integrity of its data but also verify whether the update is executed correctly. Previous researches have proposed various schemes based on Merkle Hash Tree (MHT) and implemented some initial improvements to prevent the tree imbalance. This paper tries to take one step further: Is there still any problems remained for optimization? In this paper, we study how to raise the efficiency of data dynamics by improving the parts of query and rebalancing, using a new data structure called Rank-Based Merkle AVL Tree (RB-MAT). Furthermore, we fill the gap of verifying multiple update operations at the same time, which is the novel batch updating scheme. The experimental results show that our efficient scheme has better efficiency than those of existing methods.

Read-only Transaction Processing in Wireless Data Broadcast Environments (무선 데이타 방송 환경에서 읽기-전용 트랜잭션 처리 기법)

  • Lee, Sang-Geun;Kim, Seong-Seok;Hwang, Jong-Seon
    • Journal of KIISE:Databases
    • /
    • v.29 no.5
    • /
    • pp.404-415
    • /
    • 2002
  • In this paper, we address the issue of ensuring consistency of multiple data items requested in a certain order by read-only transactions in a wireless data broadcast environment. To handle the inherent property in a data broadcast environment that data can only be accessed strictly sequential by users, we explore a predeclaration-based query optimization and devise two practical transaction processing methods in the context of local caching. We also evaluate the performance of the proposed methods by an analytical study Evaluation results show that the predeclaration technique we introduce reduces response time significantly and adapts to dynamic changes in workload.

An Improved Algorithm for Building Multi-dimensional Histograms with Overlapped Buckets (중첩된 버킷을 사용하는 다차원 히스토그램에 대한 개선된 알고리즘)

  • 문진영;심규석
    • Journal of KIISE:Databases
    • /
    • v.30 no.3
    • /
    • pp.336-349
    • /
    • 2003
  • Histograms have been getting a lot of attention recently. Histograms are commonly utilized in commercial database systems to capture attribute value distributions for query optimization Recently, in the advent of researches on approximate query answering and stream data, the interests in histograms are widely being spread. The simplest approach assumes that the attributes in relational tables are independent by AVI(Attribute Value Independence) assumption. However, this assumption is not generally valid for real-life datasets. To alleviate the problem of approximation on multi-dimensional data with multiple one-dimensional histograms, several techniques such as wavelet, random sampling and multi-dimensional histograms are proposed. Among them, GENHIST is a multi-dimensional histogram that is designed to approximate the data distribution with real attributes. It uses overlapping buckets that allow more efficient approximation on the data distribution. In this paper, we propose a scheme, OPT that can determine the optimal frequencies of overlapped buckets that minimize the SSE(Sum Squared Error). A histogram with overlapping buckets is first generated by GENHIST and OPT can improve the histogram by calculating the optimal frequency for each bucket. Our experimental result confirms that our technique can improve the accuracy of histograms generated by GENHIST significantly.