[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.3745/KIPSTD.2012.19D.1.001

A Study on Selecting Bitmap Join Index to Speed up Complex Queries in Relational Data Warehouses

An, Hyoung-Geun (School of Electrical Engineering, University of Ulsan)
Koh, Jae-Jin (School of Electrical Engineering, University of Ulsan)

Publication Information

The KIPS Transactions: Part D / v.19D, no.1, 2012, pp. 1-14

Abstract

Because data warehouses are large, the choice of indices strongly affects query processing efficiency. Indices lower query processing cost, but they occupy substantial storage and incur maintenance costs whenever the database is updated. Bitmap join indices are well suited to optimizing star join queries, which join a fact table with many dimension tables and apply selections on the dimension tables. Although the binary representation of bitmap join indices keeps their storage cost low, selecting the attributes to index from the huge set of generated candidate attributes is difficult. Index selection proceeds in two steps: first reduce the number of candidate attributes, then select the indexing attributes from those that remain. In this paper we address the bitmap join index selection problem by reducing the number of candidate attributes with data mining techniques. Whereas existing techniques prune candidate attributes by attribute frequency alone, we consider attribute frequency together with the size of the dimension tables, the size of their tuples, and the disk page size. We use frequent itemset mining to reduce the large number of candidate attributes. We then apply cost functions to the bitmap join indices of the candidate attributes and build the indices that have the lowest cost and the smallest storage footprint within the storage constraints. Finally, we compare our technique with existing ones and analyze the results to evaluate its efficiency.
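The candidate-reduction step can be illustrated with a minimal sketch. It is not the authors' implementation: it prunes by attribute frequency only (the baseline the abstract contrasts with), and it does not model the paper's additional factors (dimension table size, tuple size, page size) or its cost functions, which the abstract does not specify. The function name `frequent_attribute_sets`, the toy workload, and the attribute names are all hypothetical. Each query is represented as the set of dimension attributes it selects on, and a level-wise (Apriori-style) search keeps the attribute sets that occur in at least a given fraction of queries.

```python
from itertools import combinations


def frequent_attribute_sets(workload, min_support):
    """Return {frozenset(attributes): support} for every attribute set that
    appears in at least `min_support` (a fraction) of the workload queries.

    Level-wise Apriori-style search; frequency-only pruning (an assumption,
    not the paper's weighted criterion).
    """
    transactions = [frozenset(query) for query in workload]
    n = len(transactions)
    # Level 1 candidates: every single attribute seen in the workload.
    current = [frozenset([a]) for a in sorted({a for t in transactions for a in t})]
    result = {}
    k = 1
    while current:
        # Count how many queries contain each candidate set.
        level = {}
        for cand in current:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)
        # Join frequent k-sets into (k+1)-sets, then keep only those whose
        # k-subsets are all frequent (the Apriori pruning property).
        k += 1
        joined = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        current = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return result


# Hypothetical workload: attributes used in the selection predicates of
# five star join queries.
workload = [
    {"Customer.city", "Time.month"},
    {"Customer.city", "Product.category"},
    {"Customer.city", "Time.month", "Product.category"},
    {"Time.month"},
    {"Customer.city", "Time.month"},
]

candidates = frequent_attribute_sets(workload, min_support=0.5)
```

With this toy workload, `Product.category` falls below the threshold and is dropped, so only the remaining attribute sets would be passed on to a cost-based selection step under a storage budget.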

Keywords

Bitmap Join Index; Frequent Itemsets; Data Mining; Relational Data Warehouse

