• Title/Summary/Keyword: Join query

Preprocessing Method for Handling Multi-Way Join Continuous Queries over Data Streams (데이터 스트림에서 다중 조인 연속질의의 효과적인 처리를 위한 전처리 기법)

  • Seo, Ki-Yeon;Lee, Joo-Il;Lee, Won-Suk
    • Journal of Internet Computing and Services
    • /
    • v.13 no.3
    • /
    • pp.93-105
    • /
    • 2012
  • A data stream is a series of tuples generated in a real-time, incessant, immense, and volatile manner. As new information technologies emerge, stream processing methods are needed to handle data streams efficiently. In particular, an efficient evaluation method for multi-way joins would contribute greatly to the performance of a data stream management system, because the join is one of the most resource-consuming operators in query evaluation. In this paper, to evaluate a multi-way join continuous query efficiently, we propose a novel method that decreases the cost of a query by eliminating unsuccessful intermediate results. To this end, we propose a matrix-based structure for monitoring data streams, estimate the number of final result tuples of the query, and identify unsuccessful tuples by matrix multiplication. Using this information, we process a multi-way join continuous query efficiently by filtering out the unsuccessful tuples before the actual evaluation of the query.
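The matrix idea in this abstract can be illustrated with a small sketch. It is not the authors' structure: it assumes a three-way chain join R(a), S(a, b), T(b) over small integer domains, and the names `estimate_chain_join`, `r_counts`, `s_counts`, and `t_counts` are hypothetical. Count vectors and a count matrix summarize each stream, and matrix products give both the result size and the join values that can never reach the final result.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(x * y for x, y in zip(row, vector)) for row in matrix]


def vec_mat(vector, matrix):
    """Multiply a row vector by a matrix (list of rows)."""
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(vector)))
        for j in range(len(matrix[0]))
    ]


def estimate_chain_join(r_counts, s_counts, t_counts):
    """Summarize the chain join R(a) JOIN S(a, b) JOIN T(b).

    r_counts[a]    : number of R tuples with join value a
    s_counts[a][b] : number of S tuples with join values (a, b)
    t_counts[b]    : number of T tuples with join value b
    """
    partners_right = mat_vec(s_counts, t_counts)  # per a: matching (S, T) pairs
    partners_left = vec_mat(r_counts, s_counts)   # per b: matching (R, S) pairs
    total = sum(r * p for r, p in zip(r_counts, partners_right))
    # A value with zero partners cannot contribute to any final result tuple.
    dead_r = {a for a, n in enumerate(partners_right) if n == 0}
    dead_t = {b for b, n in enumerate(partners_left) if n == 0}
    return total, dead_r, dead_t


r = [2, 1, 0]
s = [[0, 2], [0, 0], [0, 1]]
t = [3, 4]
print(estimate_chain_join(r, s, t))  # (16, {1}, {0})
```

Here R tuples with a = 1 and T tuples with b = 0 could be dropped before the join runs, since no complete result tuple can include them.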

An Energy-Efficient In-Network Join Query Processing using Synopsis and Encoding in Sensor Network (센서 네트워크에서 시놉시스와 인코딩을 이용한 에너지 효율적인 인-네트워크 조인 질의 처리)

  • Yeo, Myung-Ho;Jang, Yong-Jin;Kim, Hyun-Ju;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association
    • /
    • v.11 no.2
    • /
    • pp.126-134
    • /
    • 2011
  • Recently, many researchers have become interested in using join queries to correlate sensor readings stored in different regions. In the conventional algorithm, the preliminary join coordinator collects synopses from sensor nodes and determines the set of sensor readings required for processing the join query. The base station then collects only part of the sensor readings instead of all readings and performs the final join. However, this approach incurs communication overhead for processing the preliminary join. In this paper, we propose a novel energy-efficient in-network join scheme that solves this problem. The proposed scheme places the preliminary join coordinator at a location that minimizes the communication cost of the preliminary join. The coordinator prunes data that do not contribute to the join result and compresses sensor readings at an early stage of join processing. The base station therefore collects only part of the compressed sensor readings together with the decompression table and determines the join result from them. As a result, the proposed scheme reduces communication costs for preliminary join processing and prolongs the network lifetime.
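The prune-then-compress flow described in this abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the paper's scheme: readings are `(node_id, value)` pairs joined on equal values, the synopsis is simply the set of distinct values per region, and dictionary encoding stands in for the unspecified compression method. All function names are hypothetical.

```python
def prune_with_synopsis(region_a, region_b):
    """Keep only readings whose value appears in both regions' synopses."""
    synopsis_a = {value for _, value in region_a}
    synopsis_b = {value for _, value in region_b}
    matching = synopsis_a & synopsis_b
    keep_a = [r for r in region_a if r[1] in matching]
    keep_b = [r for r in region_b if r[1] in matching]
    return keep_a, keep_b, sorted(matching)


def encode(readings, table):
    """Replace each value by its index in the shared decompression table."""
    code = {value: i for i, value in enumerate(table)}
    return [(node, code[value]) for node, value in readings]


def final_join(encoded_a, encoded_b, table):
    """Base-station join on the short codes, decoding values at the end."""
    nodes_by_code = {}
    for node, c in encoded_b:
        nodes_by_code.setdefault(c, []).append(node)
    return [
        (node_a, node_b, table[c])
        for node_a, c in encoded_a
        for node_b in nodes_by_code.get(c, [])
    ]


region_a = [(1, 20), (2, 25), (3, 30)]
region_b = [(7, 25), (8, 30), (9, 40)]
keep_a, keep_b, table = prune_with_synopsis(region_a, region_b)
print(final_join(encode(keep_a, table), encode(keep_b, table), table))
# [(2, 7, 25), (3, 8, 30)]
```

Only four of the six readings travel to the base station in this example, and each carries a small table index instead of the raw value.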

Grid-based Index Generation and k-nearest-neighbor Join Query-processing Algorithm using MapReduce (맵리듀스를 이용한 그리드 기반 인덱스 생성 및 k-NN 조인 질의 처리 알고리즘)

  • Jang, Miyoung;Chang, Jae Woo
    • Journal of KIISE
    • /
    • v.42 no.11
    • /
    • pp.1303-1313
    • /
    • 2015
  • MapReduce provides high levels of system scalability and fault tolerance for large-scale data processing. A MapReduce-based k-nearest-neighbor (k-NN) join algorithm seeks to produce the k nearest neighbors of each point of a dataset from another dataset. The algorithm is considered important in big data analysis. However, the existing k-NN join query-processing algorithm suffers from a high index-construction cost that makes it unsuitable for processing big data. To solve this problem, we propose a new grid-based k-NN join query-processing algorithm. Our algorithm retrieves only the neighboring data of a query cell and sends them to each MapReduce task, which reduces the overhead of data transmission and computation. Our performance analysis shows that our algorithm outperforms the existing scheme by up to seven-fold in terms of query-processing time, while also achieving a high level of query-result accuracy.
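A single-machine sketch can show what a grid-based k-NN join does, leaving out the MapReduce distribution entirely. It assumes 2-D points and a uniform grid, and it inspects only the query cell and its eight neighbors, so the answer is approximate whenever a true neighbor lies farther away. The names `build_grid`, `knn_join`, and `cell_size` are hypothetical and not taken from the paper.

```python
from collections import defaultdict
from math import dist, floor


def cell_of(point, cell_size):
    """Return the integer grid cell that contains a 2-D point."""
    return (floor(point[0] / cell_size), floor(point[1] / cell_size))


def build_grid(points, cell_size):
    """Index points by grid cell."""
    grid = defaultdict(list)
    for p in points:
        grid[cell_of(p, cell_size)].append(p)
    return grid


def knn_join(r_points, s_points, k, cell_size):
    """For each point of R, return up to k near points of S.

    Only the 3 x 3 block of cells around the query point is searched,
    which trades exactness for less data to move and compare.
    """
    grid = build_grid(s_points, cell_size)
    result = {}
    for q in r_points:
        cx, cy = cell_of(q, cell_size)
        candidates = [
            p
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            for p in grid.get((cx + dx, cy + dy), [])
        ]
        result[q] = sorted(candidates, key=lambda p: dist(q, p))[:k]
    return result


s_points = [(0.5, 0.5), (1.5, 0.5), (5.0, 5.0), (0.6, 0.4)]
print(knn_join([(0.4, 0.4)], s_points, k=2, cell_size=1.0))
# {(0.4, 0.4): [(0.5, 0.5), (0.6, 0.4)]}
```

The point (5.0, 5.0) is never compared against the query, which is where the saving in transmission and computation comes from.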

Efficient Structural Join Technique using the Level Information of Indexed XML Documents (색인된 XML 문서에서 레벨 정보를 이용한 효과적인 구조 조인 기법)

  • Lee Yunho;Choi Ilhwan;Kim Jongik;Kim Hyoung-Joo
    • Journal of KIISE:Databases
    • /
    • v.32 no.6
    • /
    • pp.641-649
    • /
    • 2005
  • As XML has come into wide use with the development of the Internet, much research on XML storage and query processing has been done. Several index techniques have been proposed to process XML path queries efficiently. Recently, structural join has received much attention as a method to process path queries. The structural join technique processes a path query by identifying the containment relationships of elements. In particular, it has the advantage that the result set can be obtained by comparing only the related elements instead of scanning the whole document. However, during the comparison process, unnecessary elements that are not included in the result set can be scanned. We therefore propose a new technique, the level structural join. This technique uses both the relationship and the level distribution of elements in the path query. Using this technique, we can improve the performance of query processing by comparing only elements with a specific level in the target inverted level.
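The benefit of level information can be seen in a minimal sketch of a parent-child structural join. It assumes the common region encoding, in which each element carries `(start, end, level)` and an ancestor's interval strictly contains its descendants' intervals. The `Element` type and `parent_child_join` function are hypothetical and do not reproduce the paper's algorithm.

```python
from collections import defaultdict
from typing import NamedTuple


class Element(NamedTuple):
    start: int  # position of the opening tag
    end: int    # position of the closing tag
    level: int  # depth in the document tree


def parent_child_join(parents, children):
    """Return (parent, child) pairs, comparing only children one level down."""
    by_level = defaultdict(list)
    for c in children:
        by_level[c.level].append(c)
    pairs = []
    for p in parents:
        # Elements at any other level cannot be direct children, so skip them.
        for c in by_level.get(p.level + 1, []):
            if p.start < c.start and c.end < p.end:
                pairs.append((p, c))
    return pairs


parents = [Element(1, 10, 1), Element(2, 5, 2)]
children = [Element(3, 4, 3), Element(6, 7, 2), Element(8, 9, 2)]
print(len(parent_child_join(parents, children)))  # 3
```

The element at level 3 lies inside the level-1 parent's interval, but it is never compared with that parent because it sits two levels down.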

Energy Join Quality Aware Real-time Query Scheduling Algorithm for Wireless Sensor Networks

  • Phuong, Luong Thi Thu;Lee, Sung-Young;Lee, Young-Koo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2011.04a
    • /
    • pp.92-96
    • /
    • 2011
  • Current research on high-rate, real-time query applications centers on real-time query scheduling protocols and energy-aware real-time query protocols. Wireless sensor networks (WSNs) should also provide quality of data in real-time query applications, which are becoming increasingly popular for WSNs. We therefore propose merging a quality-of-data function into energy efficiency, an approach called energy join quality aware real-time query scheduling (EJQRTQ). Our work calculates an energy ratio that accounts for interference among queries, then computes the expected quality of each query and allocates slots to a real-time preemptive query scheduler.

Continuous Spatio-Temporal Self-Join Queries over Stream Data of Moving Objects for Symbolic Space (기호공간에서 이동객체 스트림 데이터의 연속 시공간 셀프조인 질의)

  • Hwang, Byung-Ju;Li, Ki-Joune
    • Spatial Information Research
    • /
    • v.18 no.1
    • /
    • pp.77-87
    • /
    • 2010
  • Spatio-temporal join operators are essential to the management of spatio-temporal data such as moving objects. For example, join operators are used to analyze the movement of objects and to search for similar patterns among moving objects. Various studies on spatio-temporal join queries in outdoor space have been done. Recently, with the advance of indoor positioning techniques, location-based services are required in indoor space as well as outdoor space. Nevertheless, no study has addressed the processing of spatio-temporal join queries in indoor space. In this paper, we introduce continuous spatio-temporal self-join queries in indoor space and propose a method for processing these join queries over stream data of moving objects. A continuous spatio-temporal self-join query continuously updates the joined result set that satisfies the spatio-temporal predicates. We assume that the positions of moving objects are represented by symbols such as a room or corridor. This paper proposes a data structure, called the Candidate Pairs Buffer, to filter and maintain massive stream data efficiently, and we also investigate the performance of the proposed method in an experimental study.
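A continuous self-join over symbolic positions can be sketched as incremental maintenance of a pair set. The sketch assumes the simplest possible predicate, "two objects are currently in the same symbol", and ignores time windows. It does not reproduce the Candidate Pairs Buffer, and the class `SymbolicSelfJoin` is a hypothetical name.

```python
from collections import defaultdict


class SymbolicSelfJoin:
    """Maintain pairs of objects that currently share a symbolic location."""

    def __init__(self):
        self.location = {}                 # object id -> current symbol
        self.occupants = defaultdict(set)  # symbol -> object ids inside it
        self.pairs = set()                 # frozenset({a, b}) result pairs

    def update(self, obj, symbol):
        """Process one stream tuple: obj is now located at symbol."""
        old = self.location.get(obj)
        if old == symbol:
            return
        if old is not None:
            # Leaving the old symbol invalidates every pair involving obj.
            self.occupants[old].discard(obj)
            self.pairs = {p for p in self.pairs if obj not in p}
        for other in self.occupants[symbol]:
            self.pairs.add(frozenset((obj, other)))
        self.occupants[symbol].add(obj)
        self.location[obj] = symbol


join = SymbolicSelfJoin()
for obj, symbol in [("a", "room1"), ("b", "room1"), ("c", "corridor"), ("a", "corridor")]:
    join.update(obj, symbol)
print(join.pairs == {frozenset(("a", "c"))})  # True
```

Each update touches only the objects in the old and new symbols, so the result set is refreshed without rescanning the whole stream.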

Scope Minimization of Join Queries using a Range Window on Streaming XML Data (스트리밍 XML 데이타에서 영역 윈도우를 사용한 조인 질의의 범위 최소화 기법)

  • Park, Seog;Kim, Mi-Sun
    • Journal of KIISE:Databases
    • /
    • v.33 no.2
    • /
    • pp.224-238
    • /
    • 2006
  • As XML has become the standard for data exchange on the Internet, the need for effective query processing of XML data in streaming environments is increasing. Applying existing database techniques, which process data in units of tuples, to streaming XML data causes out-of-memory problems because memory is limited. Likewise, the cost of searching a query path and accessing specific data may increase remarkably because of the special structure of XML. In short, it is unreasonable to apply an existing database system to a streaming environment, which processes queries over partial data rather than the whole data. Thus, a system should be able to rapidly search partial streaming data that satisfy the join predicate using low-capacity memory, based on a storage technique suited to streaming XML data. In this thesis, to study a storage technique for low-capacity memory, the PCDATA and CDATA-related parts, which can be used as predicates in a join query, are fetched and saved. In addition, to compare join predicates rapidly, a range window over the streaming XML data is set so that only windows that satisfy the query are joined, based on the cardinality operators * and + in the structure information of the DTD.

An Efficient XML Query Processing Method using Path Containment Relationships (경로 포함 관계를 이용한 효율적인 XML 질의 처리기법)

  • Min, Kyung-Sub;Kim, Hyoung-Joo
    • Journal of KIISE:Databases
    • /
    • v.31 no.2
    • /
    • pp.183-194
    • /
    • 2004
  • As XML is a de facto standard data exchange language, there has been considerable research on processing XML queries efficiently. The most important consideration when processing XML queries is how efficiently path expressions in queries can be processed. Some previous works produce results by performing a sequence of join operations on all records corresponding to the labels in the path expression. Other works check the existence of paths in the query using an RDBMS's string comparison operator and produce results by extracting the records corresponding to those paths. In this paper, we propose a new query planning algorithm based on path containment relationships and two join operators that support the planning algorithm. The join operators use only the records related to the paths in a query as input data, scan them only once, and generate result data using a pipelining mechanism. Through analysis and experiments, we confirmed that our techniques (a new query planning algorithm and two join operators) achieve significantly higher performance than previous works.

Optimizing Multi-way Join Query Over Data Streams (데이타 스트림에서의 다중 조인 질의 최적화 방법)

  • Park, Hong-Kyu;Lee, Won-Suk
    • Journal of KIISE:Databases
    • /
    • v.35 no.6
    • /
    • pp.459-468
    • /
    • 2008
  • A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Many recent research activities for emerging applications need to deal with data streams. Such applications include web click monitoring, sensor data processing, network traffic analysis, telephone records, and multimedia data. In these applications, data processing over a data stream is performed not on stored data but on newly arriving data with pre-registered queries, and results are returned immediately or periodically. Recently, many studies have focused on data streams rather than stored data sets. In particular, much research aims to optimize continuous queries so that they can be executed efficiently. This paper proposes a query optimization algorithm for continuous queries that have multiple join operators (multi-way joins) over data streams. It is called Extended Greedy query optimization and is based on a greedy algorithm. It defines the join cost as the operations required to compute a join and to process its result, and stores the join cost, together with all information needed to compute it, in the statistics catalog. To overcome the poor performance that is the weak point of the greedy algorithm, the algorithm selects a set of operators with small costs instead of the single operator with the smallest cost. This set influences the accuracy and execution time of the algorithm and can be controlled adaptively by two user-defined values. Experimental results illustrate the performance of the EGA algorithm in various stream environments.
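One way to read "a set of operators with small costs" is as a bounded set of partial plans kept at each greedy step. The sketch below follows that reading and should not be taken as the paper's EGA. The cost model (the sum of intermediate result sizes for a left-deep order) and the parameters `max_candidates` and `slack`, which stand in for the two user-defined values, are assumptions.

```python
def plan_cost(order, sizes, selectivity):
    """Sum of intermediate result sizes for a left-deep join order."""
    joined = [order[0]]
    cardinality = float(sizes[order[0]])
    cost = 0.0
    for stream in order[1:]:
        cardinality *= sizes[stream]
        for other in joined:
            # Pairs without a join predicate default to selectivity 1.0.
            cardinality *= selectivity.get(frozenset((other, stream)), 1.0)
        cost += cardinality
        joined.append(stream)
    return cost


def extended_greedy(sizes, selectivity, max_candidates=2, slack=1.5):
    """Greedy join ordering that keeps several cheap partial plans per step."""
    partial = [(name,) for name in sizes]
    for _ in range(len(sizes) - 1):
        extended = [p + (s,) for p in partial for s in sizes if s not in p]
        extended.sort(key=lambda o: plan_cost(o, sizes, selectivity))
        best = plan_cost(extended[0], sizes, selectivity)
        # Keep plans within `slack` of the best, capped at `max_candidates`.
        partial = [
            o for o in extended
            if plan_cost(o, sizes, selectivity) <= best * slack
        ][:max_candidates]
    return partial[0]


sizes = {"A": 100, "B": 10, "C": 1000}
selectivity = {
    frozenset(("A", "B")): 0.1,
    frozenset(("B", "C")): 0.01,
    frozenset(("A", "C")): 0.5,
}
order = extended_greedy(sizes, selectivity)
print(order, plan_cost(order, sizes, selectivity))
```

Setting `max_candidates=1` reduces this to a plain greedy search. Larger values explore more orders at the price of a longer optimization time, which mirrors the accuracy versus execution-time trade-off mentioned in the abstract.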

Estimating Join Selectivity of Global XQuery Queries in Distributed Environments (분산 환경에서 전역 XQuery 질의의 조인 선택치 추정 방법)

  • Park, Jong-Hyun;Kang, Ji-Hoon
    • Journal of KIISE:Databases
    • /
    • v.34 no.6
    • /
    • pp.564-571
    • /
    • 2007
  • One method for integrating XML data in distributed environments is to use XML views. Users can query distributed local XML views with global queries written in XQuery, a standard query language for searching XML data. Global XQuery queries naturally contain join operations because they integrate and search distributed heterogeneous data. Since join operations are generally expensive, their processing technique is very important for the efficient processing of global XQuery queries. There are therefore several studies on the efficient processing of join operations, one of which selects the minimum join cost by estimating join selectivity. For SQL, there is already some research on estimating the join selectivity and join cost of global SQL queries. However, those methods for estimating the selectivity of join operations in SQL queries cannot be applied to XQuery queries because of the structural difference between relational data and XML data. This paper therefore proposes a method for estimating the selectivity of join operations in XQuery queries using the information in XML views. Our contribution is threefold. First, we define the differences between SQL and XQuery in estimating join selectivity. Second, we estimate join selectivity in XQuery queries by referring to XML views. Third, we evaluate our estimation method.