http://dx.doi.org/10.3745/KIPSTD.2003.10D.6.927

Block Histogram Compression Method for Selectivity Estimation in High-dimensions  

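Lee, Ju-Hong (School of Computer Science and Engineering, Inha University)
Jeon, Seok-Ju (Department of Internet Information, Ansan 1 College)
Park, Seon (Department of Computer Science and Engineering, Graduate School, Inha University)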
Abstract
A database query optimizer estimates the selectivity of a query to find the most efficient access plan. A multi-dimensional selectivity estimation technique is required for a query with multiple attributes because the attributes are not independent of each other. Histograms are used in most commercial database products because they approximate data distributions with small overhead and small error rates. However, a histogram is inadequate for a query with multiple attributes because it incurs high storage overhead and high error rates. In this paper, we propose a novel method for multi-dimensional selectivity estimation. Compressed information from a large number of small-sized histogram buckets is maintained using the discrete cosine transform. This enables low error rates and low storage overhead even in high dimensions. Extensive experimental results show the advantages of the proposed approach.
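The idea of keeping compressed information from many small buckets can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not say how coefficients are selected or how the estimate is computed, so the sketch assumes a two-dimensional grid of equal-sized buckets, keeps the coefficients with the lowest frequency sum u + v (zonal sampling, an assumption), and estimates selectivity by rebuilding approximate bucket counts. All function names (dct_basis, compress, reconstruct, range_count) are hypothetical.

```python
import math

def dct_basis(n):
    """Orthonormal DCT-II basis matrix: row u is the u-th cosine basis vector."""
    rows = []
    for u in range(n):
        scale = math.sqrt((1.0 if u == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def compress(hist, keep):
    """2-D DCT of an n x n grid of bucket counts; store only `keep` coefficients.

    Assumption: coefficients are ranked by u + v (low frequencies first).
    """
    n = len(hist)
    c = dct_basis(n)
    coeffs = matmul(matmul(c, hist), transpose(c))
    ranked = sorted(((u, v) for u in range(n) for v in range(n)),
                    key=lambda uv: (uv[0] + uv[1], uv))
    return {uv: coeffs[uv[0]][uv[1]] for uv in ranked[:keep]}

def reconstruct(kept, n):
    """Inverse 2-D DCT, treating every dropped coefficient as zero."""
    c = dct_basis(n)
    coeffs = [[kept.get((u, v), 0.0) for v in range(n)] for u in range(n)]
    return matmul(matmul(transpose(c), coeffs), c)

def range_count(hist, x_lo, x_hi, y_lo, y_hi):
    """Sum bucket counts over the half-open range [x_lo, x_hi) x [y_lo, y_hi)."""
    return sum(hist[x][y] for x in range(x_lo, x_hi) for y in range(y_lo, y_hi))

if __name__ == "__main__":
    n = 8
    # Synthetic, correlated data: counts concentrate near the diagonal x == y.
    hist = [[round(100 * math.exp(-((x - y) ** 2) / 4.0)) for y in range(n)]
            for x in range(n)]
    total = range_count(hist, 0, n, 0, n)
    kept = compress(hist, keep=16)          # 16 of 64 coefficients stored
    approx = reconstruct(kept, n)
    exact = range_count(hist, 2, 6, 2, 6) / total
    estimate = range_count(approx, 2, 6, 2, 6) / total
    print(f"exact={exact:.4f} estimate={estimate:.4f}")
```

Because the basis is orthonormal, keeping every coefficient reproduces the buckets exactly, and keeping only the first coefficient reduces to a uniform-distribution estimate. In higher dimensions the same transform would be applied along each axis in turn.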
Keywords
Database Query Optimizer; Block Partitioned Histogram; Multi-dimension Histogram