• Title/Summary/Keyword: Join

Search Result 1,153, Processing Time 0.03 seconds

Optimizing Multi-way Join Query Over Data Streams (데이타 스트림에서의 다중 조인 질의 최적화 방법)

  • Park, Hong-Kyu;Lee, Won-Suk
    • Journal of KIISE:Databases
    • /
    • v.35 no.6
    • /
    • pp.459-468
    • /
    • 2008
  • A data stream which is a massive unbounded sequence of data elements continuously generated at a rapid rate. Many recent research activities for emerging applications often need to deal with the data stream. Such applications can be web click monitoring, sensor data processing, network traffic analysis. telephone records and multi-media data. For this. data processing over a data stream are not performed on the stored data but performed the newly updated data with pre-registered queries, and then return a result immediately or periodically. Recently, many studies are focused on dealing with a data stream more than a stored data set. Especially. there are many researches to optimize continuous queries in order to perform them efficiently. This paper proposes a query optimization algorithm to manage continuous query which has multiple join operators(Multi-way join) over data streams. It is called by an Extended Greedy query optimization based on a greedy algorithm. It defines a join cost by a required operation to compute a join and an operation to process a result and then stores all information for computing join cost and join cost in the statistics catalog. To overcome a weak point of greedy algorithm which has poor performance, the algorithm selects the set of operators with a small lay, instead of operator with the smallest cost. The set is influenced the accuracy and execution time of the algorithm and can be controlled adaptively by two user-defined values. Experiment results illustrate the performance of the EGA algorithm in various stream environments.

A Study on the Efficiency of Join Operation On Stream Data Using Sliding Windows (스트림 데이터에서 슬라이딩 윈도우를 사용한 조인 연산의 효율에 관한 연구)

  • Yang, Young-Hyoo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.2
    • /
    • pp.149-157
    • /
    • 2012
  • In this thesis, the problem of computing approximate answers to continuous sliding-window joins over data streams when the available memory may be insufficient to keep the entire join state. One approximation scenario is to provide a maximum subset of the result, with the objective of losing as few result tuples as possible. An alternative scenario is to provide a random sample of the join result, e.g., if the output of the join is being aggregated. It is shown formally that neither approximation can be addressed effectively for a sliding-window join of arbitrary input streams. Previous work has addressed only the maximum-subset problem, and has implicitly used a frequency based model of stream arrival. There exists a sampling problem for this model. More importantly, it is shown that a broad class of applications for which an age-based model of stream arrival is more appropriate, and both approximation scenarios under this new model are addressed. Finally, for the case of multiple joins being executed with an overall memory constraint, an algorithm for memory allocation across the join that optimizes a combined measure of approximation in all scenarios considered is provided.

Adaptive Row Major Order: a Performance Optimization Method of the Transform-space View Join (적응형 행 기준 순서: 변환공간 뷰 조인의 성능 최적화 방법)

  • Lee Min-Jae;Han Wook-Shin;Whang Kyu-Young
    • Journal of KIISE:Databases
    • /
    • v.32 no.4
    • /
    • pp.345-361
    • /
    • 2005
  • A transform-space index indexes objects represented as points in the transform space An advantage of a transform-space index is that optimization of join algorithms using these indexes becomes relatively simple. However, the disadvantage is that these algorithms cannot be applied to original-space indexes such as the R-tree. As a way of overcoming this disadvantages, the authors earlier proposed the transform-space view join algorithm that joins two original- space indexes in the transform space through the notion of the transform-space view. A transform-space view is a virtual transform-space index that allows us to perform join in the transform space using original-space indexes. In a transform-space view join algorithm, the order of accessing disk pages -for which various space filling curves could be used -makes a significant impact on the performance of joins. In this paper, we Propose a new space filling curve called the adaptive row major order (ARM order). The ARM order adaptively controls the order of accessing pages and significantly reduces the one-pass buffer size (the minimum buffer size required for guaranteeing one disk access per page) and the number of disk accesses for a given buffer size. Through analysis and experiments, we verify the excellence of the ARM order when used with the transform-space view join. The transform-space view join with the ARM order always outperforms existing ones in terms of both measures used: the one-pass buffer size and the number of disk accesses for a given buffer size. Compared to other conventional space filling curves used with the transform-space view join, it reduces the one-pass buffer size by up to 21.3 times and the number of disk accesses by up to $74.6\%$. In addition, compared to existing spatial join algorithms that use R-trees in the original space, it reduces the one-pass buffer size by up to 15.7 times and the number of disk accesses by up to $65.3\%$.

Efficient Parallel Spatial Join Processing Method in a Shared-Nothing Database Cluster System (비공유 공간 클러스터 환경에서 효율적인 병렬 공간 조인 처리 기법)

  • Chung, Warn-Ill;Lee, Chung-Ho;Bae, Hae-Young
    • The KIPS Transactions:PartD
    • /
    • v.10D no.4
    • /
    • pp.591-602
    • /
    • 2003
  • Delay and discontinuance phenomenon of service are cause by sudden increase of the network communication amount and the quantity consumed of resources when Internet users are driven excessively to a conventional single large database sewer. To solve these problems, spatial database cluster consisted of several single nodes on high-speed network to offer high-performance is risen. But, research about spatial join operation that can reduce the performance of whole system in case process at single node is not achieved. So, in this paper, we propose efficient parallel spatial join processing method in a spatial database cluster system that uses data partitions and replications method that considers the characteristics of space data. Since proposed method does not need the creation step and the assignment step of tasks, and does not occur additional message transmission between cluster nodes that appear in existent parallel spatial join method, it shows performance improvement of 23% than the conventional parallel R-tree spatial join for a shared-nothing architecture about expensive spatial join queries. Also, It can minimize the response time to user because it removes redundant refinement operation at each cluster node.

Performance Comparison of Column-Oriented and Row-Oriented Database Systems for Star Schema Join Processing (스타 스키마 조인 처리에 대한 세로-지향 데이터베이스 시스템과 가로-지향 데이터베이스 시스템의 성능 비교)

  • Oh, Byung-Jung;Ahn, Soo-Min;Kim, Kyung-Chang
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.8
    • /
    • pp.29-38
    • /
    • 2011
  • Unlike in traditional row-oriented database systems, a column-oriented database system stores data in column-oriented and not row-oriented order. Recently, research results revealed the effectiveness of column-oriented databases for applications such as data warehouse and decision support systems that access large volumes of data in a read only manner. In this paper, we investigate the join strategies for column-oriented databases and prove the effectiveness of column-oriented databases in data warehouse systems. For unbiased comparison, the two database systems are analyzed using the star schema benchmark and the performance analysis of a star schema join query is carried out. We experimented with well-known join algorithms and considered early materialization and late materialization join strategies for column-oriented databases. The performance results confirm that star schema join queries perform better in terms of disk I/O cost in column-oriented databases than in row-oriented databases. In addition, the late materialization strategy showed more performance gain than the early materialization strategy in column-oriented databases.

Spatial Join based on the Transform-Space View (변환공간 뷰를 기반으로한 공간 조인)

  • 이민재;한욱신;황규영
    • Journal of KIISE:Databases
    • /
    • v.30 no.5
    • /
    • pp.438-450
    • /
    • 2003
  • Spatial joins find pairs of objects that overlap with each other. In spatial joins using indexes, original-space indexes such as the R-tree are widely used. An original-space index is the one that indexes objects as represented in the original space. Since original-space indexes deal with sizes of objects, it is difficult to develop a formal algorithm without relying on heuristics. On the other hand, transform-space indexes, which transform objects in the original space into points in the transform space and index them, deal only with points but no sites. Thus, spatial join algorithms using these indexes are relatively simple and can be formally developed. However, the disadvantage of transform-space join algorithms is that they cannot be applied to original-space indexes such as the R-tree containing original-space objects. In this paper, we present a novel mechanism for achieving the best of these two types of algorithms. Specifically, we propose a new notion of the transform-space view and present the transform-space view join algorithm(TSVJ). A transform-space view is a virtual transform-space index based on an original-space index. It allows us to interpret on-the-fly a pre-built original-space index as a transform-space index without incurring any overhead and without actually modifying the structure of the original-space index or changing object representation. The experimental result shows that, compared to existing spatial join algorithms that use R-trees in the original space, the TSVJ improves the number of disk accesses by up to 43.1% The most important contribution of this paper is to show that we can use original-space indexes, such as the R-tree, in the transform space by interpreting them through the notion of the transform-space view. We believe that this new notion provides a framework for developing various new spatial query processing algorithms in the transform space.

Conjunctive Boolean Query Optimization based on Join Sequence Separability in Information Retrieval Systems (정보검색시스템에서 조인 시퀀스 분리성 기반 논리곱 불리언 질의 최적화)

  • 박병권;한욱신;황규영
    • Journal of KIISE:Databases
    • /
    • v.31 no.4
    • /
    • pp.395-408
    • /
    • 2004
  • A conjunctive Boolean text query refers to a query that searches for tort documents containing all of the specified keywords, and is the most frequently used query form in information retrieval systems. Typically, the query specifies a long list of keywords for better precision, and in this case, the order of keyword processing has a significant impact on the query speed. Currently known approaches to this ordering are based on heuristics and, therefore, cannot guarantee an optimal ordering. We can use a systematic approach by leveraging a database query processing algorithm like the dynamic programming, but it is not suitable for a text query with a typically long list of keywords because of the algorithm's exponential run-time (Ο(n2$^{n-1}$)) for n keywords. Considering these problems, we propose a new approach based on a property called the join sequence separability. This property states that the optimal join sequence is separable into two subsequences of different join methods under a certain condition on the joined relations, and this property enables us to find a globally optimal join sequence in Ο(n2$^{n-1}$). In this paper we describe the property formally, present an optimization algorithm based on the property, prove that the algorithm finds an optimal join sequence, and validate our approach through simulation using an analytic cost model. Comparison with the heuristic text query optimization approaches shows a maximum of 100 times faster query processing, and comparison with the dynamic programming approach shows exponentially faster query optimization (e.g., 600 times for a 10-keyword query).

Effects of A Qigong Training Program on the Anxiety and Labor Pain of Primipara (기공체조프로그램이 초산부의 불안 및 분만통증 완화에 미치는 효과)

  • Jeong, Soon-Ok;Kho, Hyo-Jung;Lee, Eun-Ju
    • Women's Health Nursing
    • /
    • v.12 no.2
    • /
    • pp.97-105
    • /
    • 2006
  • Purpose: The purpose of this study is to verify the effects of the Qigong training program on the anxiety and labor pains of primipara. Method: The research subjects were a total of 60 primipara who consulted a doctor regularly concerning their antenatal care. Among them, 30 people were the experimental group, and the other 30 people were the control group, and were selected as homogeneous with the experimental group. The degree of anxiety and labor pains were measured by State-Trait Anxiety Inventory(STAI) and Graphic Rating Scale(GRS). SPSS WIN 11.0 was used for data analysis. Obstetric and general characteristics between experimental and control groups, and a homogeneity test of state and trait anxiety were done by both $X^2$ test and t-test. The hypothesis testing was analyzed by ANCOVA with a covariate of pretest value. Result: The first hypothesis, 'Primipara who join the Qigong training program have lower anxiety than those who do not join' was supported (F=28.8, p<.000). The second hypothesis, 'Primipara who join the Qigong training program have lower labor pain than those who do not join' was unsupported. Conclusion: It was verified that the Qigong training program was effective in alleviating anxiety; however it did not have any effect on relieving labor pain, so more in-depth research is needed later on.

  • PDF