An Efficient Compression Method for Multi-dimensional Index Structures

  • 조형주 (Department of Electrical Engineering and Computer Science, KAIST)
  • 정진완 (Department of Electrical Engineering and Computer Science, KAIST)
  • Published : 2003.10.01

Abstract

Over the last few decades, improvements in CPU speed have outpaced those in memory and disk speeds by orders of magnitude, which makes it practical to use compression techniques to reduce both database size and query cost. Although compression is employed in various areas of database research, there is little work on compressing multi-dimensional index structures. In this paper, we propose an efficient compression method, the hybrid encoding method (HEM), that is tailored to multi-dimensional index structures. HEM compression significantly reduces both the size of multi-dimensional index structures and the query cost. Through mathematical analysis and extensive experiments, we show that HEM compression outperforms an existing method in terms of index size and query cost.
