http://dx.doi.org/10.3745/KIPSTD.2012.19D.1.001

A Study on Selecting Bitmap Join Index to Speed up Complex Queries in Relational Data Warehouses  

An, Hyoung-Geun (School of Electrical Engineering, University of Ulsan)
Koh, Jae-Jin (School of Electrical Engineering, University of Ulsan)
Abstract
Because data warehouses are large, the choice of indices strongly affects query processing efficiency. Indices lower query processing cost, but they occupy considerable storage and incur maintenance cost whenever the database is updated. Bitmap join indices are well suited to optimizing star join queries, which join a fact table with many dimension tables and apply selections on the dimension tables. Although their binary representation keeps storage cost low, selecting the attributes to index from the very large set of generated candidate attributes is difficult. Index selection proceeds in two steps: first reduce the number of candidate attributes, then select the attributes to index. In this paper, we address the bitmap join index selection problem by reducing the number of candidate attributes with data mining techniques. Whereas existing techniques prune candidates by attribute frequency alone, we also consider the size of the dimension tables, the size of their tuples, and the disk page size. We mine frequent itemsets to prune the large number of candidate attributes. We then apply cost functions to the bitmap join indices on the remaining candidates and build the indices with the lowest cost and the smallest storage area that satisfy the storage constraint. Finally, we compare our technique with existing ones and analyze the results to evaluate its efficiency.
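The two-step process described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the workload, the attribute names, the cardinalities, and the function names `frequent_attribute_sets` and `select_under_budget` are all hypothetical. The first step is a plain level-wise (Apriori-style) frequent itemset search over the attributes each query uses. The second step is a frequency-first greedy rule with an uncompressed bitmap size estimate, standing in for the paper's cost functions, which the abstract does not spell out.

```python
from itertools import combinations


def frequent_attribute_sets(workload, min_count):
    """Level-wise search for attribute sets used by at least min_count queries.

    workload: list of sets, each holding the dimension attributes one query
    selects on. Returns {frozenset(attributes): number_of_queries}.
    """
    queries = [frozenset(q) for q in workload]

    def count(itemset):
        return sum(1 for q in queries if itemset <= q)

    # Level 1: single attributes that are frequent on their own.
    current = {}
    for attr in {a for q in queries for a in q}:
        c = count(frozenset([attr]))
        if c >= min_count:
            current[frozenset([attr])] = c

    frequent = dict(current)
    k = 2
    while current:
        # Join step: combine frequent (k-1)-sets into k-set candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        nxt = {}
        for cand in candidates:
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(s) in current for s in combinations(cand, k - 1)):
                c = count(cand)
                if c >= min_count:
                    nxt[cand] = c
        frequent.update(nxt)
        current = nxt
        k += 1
    return frequent


def select_under_budget(frequent, cardinality, fact_rows, budget_bytes):
    """Greedy placeholder for cost-based selection (an assumption, see lead-in).

    Size estimate: one uncompressed bitmap of fact_rows bits per distinct
    value of each indexed attribute.
    """
    bitmap_bytes = (fact_rows + 7) // 8
    ranked = sorted(frequent.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    chosen, used = [], 0
    for attrs, _ in ranked:
        size = sum(cardinality[a] for a in attrs) * bitmap_bytes
        if used + size <= budget_bytes:
            chosen.append(attrs)
            used += size
    return chosen, used


# Hypothetical workload: attributes referenced by five star join queries.
workload = [
    {"customer.city", "time.month"},
    {"customer.city", "product.category"},
    {"customer.city", "time.month", "product.category"},
    {"time.month"},
    {"customer.gender"},
]
# Hypothetical number of distinct values per attribute.
cardinality = {
    "customer.city": 50,
    "time.month": 12,
    "product.category": 20,
    "customer.gender": 2,
}

freq = frequent_attribute_sets(workload, min_count=2)
chosen, used = select_under_budget(
    freq, cardinality, fact_rows=1_000_000, budget_bytes=10_000_000
)
print(sorted(sorted(s) for s in chosen), used)
```

A real selector would replace the greedy ranking with cost functions that, as the abstract states, account for dimension table size, tuple size, and disk page size in addition to attribute frequency.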
Keywords
Bitmap Join Index; Frequent Itemsets; Data Mining; Relational Data Warehouse