Efficient Storage Techniques for Multidimensional Index Structures in Multi-Zoned Disk Environments

다중 존 디스크 환경에서 다차원 인덱스 구조의 효율적 저장 기법

  • Published : 2007.08.15

Abstract

The performance of database applications with large sets of multidimensional data depends on the performance of their access methods and the underlying disk system. Although modern disks are manufactured with multiple physical zones, conventional access methods have been developed on a traditional disk model with many simplifying assumptions. As a result, there has been little investigation of how to enhance the performance of access methods under a zoned disk model. This paper proposes novel zoning techniques that can be applied to any multidimensional access method, static or dynamic, and that enhance the effective data transfer rate of the underlying disk system by fully utilizing its zone characteristics. Our zoning techniques include data placement algorithms for multidimensional index structures and accompanying localized query processing algorithms for range queries. The experimental results show that our zoning techniques significantly improve query performance.

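In database applications that handle large volumes of multidimensional data, the access methods and the underlying disk system have a major effect on overall performance. Many disks manufactured today are designed with multiple physical zones. However, research on access methods has been based on a traditional disk model with simplifying assumptions, and access methods that take multi-zone disks into account have received almost no study to date. To improve the effective data transfer rate in a multi-zone disk environment, this paper proposes disk storage techniques for multidimensional index structures that cover both static and dynamic environments. To this end, it presents algorithms that place multidimensional index structures effectively on a multi-zone disk and proposes a localized query processing technique for range queries. Experiments demonstrate that the proposed techniques dramatically improve query performance.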
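
The abstract does not spell out the placement algorithms, so the following is only a minimal sketch of the general idea behind zone-aware placement, not the authors' method. It assumes that each index page has a known access frequency and that each zone has a fixed sustained transfer rate and page capacity; the names `Zone`, `place_hot_first`, and `effective_rate` are hypothetical. The sketch puts the most frequently read pages in the fastest zones and compares the resulting effective transfer rate with a placement that ignores zones.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    rate_mb_s: float     # assumed sustained transfer rate of the zone
    capacity_pages: int  # assumed number of index pages the zone can hold


def place_hot_first(access_freq, zones):
    """Return a zone index for every page, hottest pages in fastest zones.

    Illustrative greedy heuristic only; it is not taken from the paper.
    """
    if len(access_freq) > sum(z.capacity_pages for z in zones):
        raise ValueError("not enough zone capacity for all pages")
    pages_by_heat = sorted(range(len(access_freq)),
                           key=lambda p: -access_freq[p])
    zones_by_speed = sorted(range(len(zones)),
                            key=lambda z: -zones[z].rate_mb_s)
    placement = [0] * len(access_freq)
    cursor = 0
    for z in zones_by_speed:
        # Fill the current fastest remaining zone with the hottest remaining pages.
        for p in pages_by_heat[cursor:cursor + zones[z].capacity_pages]:
            placement[p] = z
        cursor += zones[z].capacity_pages
    return placement


def effective_rate(access_freq, placement, zones):
    """Access-weighted transfer rate: data moved divided by transfer time.

    Page size cancels out, so pages are treated as unit-sized.
    """
    total_time = sum(f / zones[z].rate_mb_s
                     for f, z in zip(access_freq, placement))
    return sum(access_freq) / total_time


if __name__ == "__main__":
    zones = [Zone(rate_mb_s=30.0, capacity_pages=2),   # slower inner zone
             Zone(rate_mb_s=60.0, capacity_pages=2)]   # faster outer zone
    freq = [1, 10, 1, 10]                              # made-up access counts
    zoned = place_hot_first(freq, zones)
    naive = [0, 0, 1, 1]                               # pages laid out in order
    print("zone-aware:", effective_rate(freq, zoned, zones))
    print("zone-oblivious:", effective_rate(freq, naive, zones))
```

The sketch models transfer time only; seek and rotational delays, which a full disk model would include, are left out to keep the comparison readable.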
