http://dx.doi.org/10.3745/KIPSTD.2005.12D.2.179

Skewed Data Handling Technique Using an Enhanced Spatial Hash Join Algorithm  

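Shim Young-Bok (Department of Computer Education, Graduate School, Chungbuk National University)
Lee Jong-Yun (Department of Computer Education, Chungbuk National University)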
Abstract
Spatial joins have been studied extensively over the last decade. In this paper, we focus on the filtering step, which selects candidate objects for a spatial join, in the case where none of the input tables is indexed. Many algorithms have been presented for this case, and they show excellent performance on most spatial data. However, if the input data sets for the spatial join are skewed, join performance degrades dramatically, and little research has attempted to solve this problem. We therefore propose a spatial hash strip join (SHSJ) algorithm that combines properties of the SSSJ algorithm with those of the existing spatial hash join (SHJ) algorithm, which partitions space according to the distribution of the input data set. Finally, to show that SHSJ performs better in both uniform and skewed cases, we run experiments on the Tiger/line data sets and compare SHSJ with the SHJ algorithm.
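To make the filtering step concrete, the following is a minimal sketch of a generic partition-based filter over minimum bounding rectangles (MBRs). It is not the SHSJ or SHJ algorithm from the paper: the fixed uniform grid, the function names, and the `cell_size` parameter are all assumptions chosen for illustration. The `largest_bucket` helper hints at why skew hurts, since clustered input puts most objects into a few buckets.

```python
from collections import defaultdict
from itertools import product

# Illustrative only: a fixed uniform grid stands in for the spatial
# partitioning; the paper's SHSJ/SHJ partitioning is not reproduced here.

def intersects(a, b):
    """True if two MBRs (xmin, ymin, xmax, ymax) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def cells(mbr, cell_size):
    """Return every grid cell (column, row) that the MBR overlaps."""
    x0, y0 = int(mbr[0] // cell_size), int(mbr[1] // cell_size)
    x1, y1 = int(mbr[2] // cell_size), int(mbr[3] // cell_size)
    return product(range(x0, x1 + 1), range(y0, y1 + 1))

def partition(rects, cell_size):
    """Hash each rectangle id into the bucket of every cell it overlaps."""
    buckets = defaultdict(list)
    for rid, mbr in enumerate(rects):
        for cell in cells(mbr, cell_size):
            buckets[cell].append(rid)
    return buckets

def filter_step(r, s, cell_size):
    """Return candidate pairs (i, j) whose MBRs intersect.

    Only rectangles that share a grid cell are compared. A set removes
    the duplicates produced when a pair shares more than one cell.
    """
    r_buckets, s_buckets = partition(r, cell_size), partition(s, cell_size)
    candidates = set()
    for cell, r_ids in r_buckets.items():
        for i in r_ids:
            for j in s_buckets.get(cell, ()):
                if intersects(r[i], s[j]):
                    candidates.add((i, j))
    return candidates

def largest_bucket(rects, cell_size):
    """Size of the fullest bucket; large values indicate skewed input."""
    buckets = partition(rects, cell_size)
    return max((len(ids) for ids in buckets.values()), default=0)
```

For example, `filter_step([(0, 0, 2, 2), (5, 5, 6, 6)], [(1, 1, 3, 3), (10, 10, 11, 11)], 4)` compares only the two rectangles that fall into the same cell. The candidate pairs it returns would still need an exact-geometry refinement step, which this sketch omits.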
Keywords
Spatial database; spatial join; query processing
  • Reference
1 S. T. Leutenegger, J. Edgington, and M. A. Lopez, 'STR: A Simple and Efficient Algorithm for R-Tree Packing,' In Proceedings of International Conference on Data Engineering, pp.497-506, Apr., 1997
2 R. Elmasri and S. B. Navathe, Fundamentals of Database Systems, 3rd edition, Addison-Wesley Publishers, pp.594-600, 2000
3 M. L. Lo and C. V. Ravishankar, 'Spatial joins using seeded trees,' In Proceedings of ACM SIGMOD International Conference on Management of Data, Minneapolis, MN, pp.209-220, May, 1994
4 M. L. Lo and C. V. Ravishankar, 'Generating seeded trees from data sets,' In the Fourth International Symposium on Large Spatial Databases (Advances in Spatial Databases:SSD '95), Portland, Maine, pp.328-347, Aug., 1995
5 N. Mamoulis and D. Papadias, 'Slot Index Spatial Join,' IEEE Transactions on Knowledge and Data Engineering, Vol.15, No.1, Jan/Feb., 2003
6 M. L. Lo and C. V. Ravishankar, 'The Design and Implementation of Seeded Trees: An Efficient Method for Spatial Joins,' IEEE Transactions on Knowledge and Data Engineering, Vol.10, No.1, pp.136-151, 1998
7 J. M. Patel and D. J. DeWitt, 'Partition Based Spatial-Merge Join,' In Proceedings of ACM SIGMOD International Conference on Management of Data, pp.259-270, Jun., 1996
8 N. Koudas and K. Sevcik, 'Size Separation Spatial Join,' In Proceedings of ACM SIGMOD International Conference on Management of Data, pp.324-335, May, 1997
9 R. H. Güting and W. Schilling, 'A Practical Divide-and-Conquer Algorithm for the Rectangle Intersection Problem,' Information Sciences, Vol.42, No.2, pp.95-112, July, 1987
10 L. Arge, O. Procopiuc, S. Ramaswami, T. Suel, and J. Vitter, 'Scalable Sweeping Based Spatial Join,' In Proceedings of International Conference on Very Large Data Bases, pp.570-581, Aug., 1998
11 U.S. Bureau of the Census, '2002 Tiger/line Files,' 2002
12 A. Guttman, 'R-Trees: A Dynamic Index Structure for Spatial Searching,' In Proceedings of ACM SIGMOD International Conference on Management of Data, pp.47-57, Jun., 1984
13 L. Becker, K. Hinrichs, and U. Finke, 'A New Algorithm for Computing of Spatial Joins Using R-trees,' In Proceedings of the Ninth International Conference on Data Engineering, pp.190-197, Vienna, Austria, Apr., 1993
14 T. Brinkhoff, H. Kriegel, R. Schneider, and B. Seeger, 'Multi-Step Processing of Spatial Joins,' In Proceedings of ACM SIGMOD International Conference on Management of Data, pp.197-208, Jun., 1994
15 M. L. Lo and C. V. Ravishankar, 'Spatial Hash-Joins,' In Proceedings of ACM SIGMOD International Conference on Management of Data, pp.209-220, May, 1996
16 J. A. Orenstein, 'Redundancy in Spatial Databases,' In Proceedings of ACM SIGMOD International Conference on Management of Data, pp.294-305, June, 1989