
An Advanced Parallel Join Algorithm for Managing Data Skew on Hypercube Systems  

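Youngsun Weon (BK, Division of Information and Communication Engineering, Chungnam National University)
Man Pyo Hong (Division of Information and Computer Engineering, Ajou University)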
Abstract
In this paper, we propose an advanced parallel join algorithm that efficiently processes join operations on hypercube systems. The algorithm processes relation R with a broadcasting method that fits the hypercube structure, which lets us present a parallel join algorithm optimized for that structure. The proposed algorithm provides a complete solution to two essential problems in parallelizing the join operation: load balancing and data skew. To solve these problems, the algorithm exploits the clustering effect. As a result, overall system performance improves over existing algorithms. Moreover, the new algorithm can easily implement non-equijoin operations, which are difficult to implement in hash-based algorithms. Finally, according to a cost model analysis, the algorithm performs better than existing parallel join algorithms.
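The abstract does not give the algorithm's steps, so the following is only a minimal sketch of the general idea it describes, not the authors' method. It simulates a generic broadcast-based join under stated assumptions: relation S is split round-robin so every node holds about the same number of tuples regardless of key skew, the fragments of R are exchanged along the hypercube dimensions until every node holds all of R, and each node then applies an arbitrary predicate locally, which is why non-equijoins need no special handling. The names `hypercube_allgather` and `broadcast_join` and the row layout are hypothetical, and the clustering effect mentioned in the abstract is not modeled.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, str]  # hypothetical row layout: (join key, payload)


def hypercube_allgather(fragments: List[List[Row]]) -> List[List[Row]]:
    """Simulate an all-to-all broadcast on a hypercube.

    fragments[i] is the part of R initially stored on node i. The number of
    nodes must be a power of two. In the step for dimension bit k, node i
    exchanges everything it has gathered so far with its neighbor i XOR k.
    """
    p = len(fragments)
    assert p > 0 and p & (p - 1) == 0, "node count must be a power of two"
    gathered = [list(f) for f in fragments]
    k = 1
    while k < p:
        # Snapshot first so that all exchanges in one step are simultaneous.
        snapshot = [list(g) for g in gathered]
        for node in range(p):
            gathered[node] = snapshot[node] + snapshot[node ^ k]
        k <<= 1
    return gathered


def broadcast_join(
    r: List[Row],
    s: List[Row],
    p: int,
    predicate: Callable[[Row, Row], bool],
) -> List[List[Tuple[Row, Row]]]:
    """Return the join result produced by each of the p simulated nodes."""
    # Round-robin placement (an assumption of this sketch): the fragment
    # sizes do not depend on the distribution of join-key values.
    r_frag = [r[i::p] for i in range(p)]
    s_frag = [s[i::p] for i in range(p)]
    r_all = hypercube_allgather(r_frag)
    # Each node joins its local fragment of S against the whole of R.
    return [
        [(a, b) for b in s_frag[node] for a in r_all[node] if predicate(a, b)]
        for node in range(p)
    ]
```

For example, with `p = 4` and an S relation whose eight tuples all share one key, each node still receives two S tuples, and passing `lambda a, b: a[0] < b[0]` instead of an equality test runs a non-equijoin through the same code path. The price of this design is that R is replicated on every node, so it is only attractive when R is small enough to replicate or when skew makes hash partitioning unbalanced.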
Keywords
hypercube; parallel join; load balancing; data skew