http://dx.doi.org/10.3745/KIPSTD.2006.13D.5.651

An Efficient Hashing Mechanism of the DHP Algorithm for Mining Association Rules  

Lee, Hyung-Bong (Department of Computer Engineering, Kangnung National University)
Abstract
Association rule mining algorithms based on the Apriori algorithm use a hash tree data structure to store the candidate frequent itemsets and count their supports, and most of the execution time is spent searching the hash tree. The DHP (Direct Hashing and Pruning) algorithm tries to reduce the number of candidate frequent itemsets in order to save search time in the hash tree. To this end, DHP performs a simple preliminary support count of the candidate frequent itemsets, and it uses a direct hash table to keep the overhead of this preliminary counting low. This paper proposes and evaluates an efficient hashing mechanism for the direct hash table $H_2$, which is used for pruning in phase 2, and for the hash tree $C_k$, which is used for counting the supports of the candidate frequent itemsets in all phases. The results show that the proposed hashing mechanism improves performance by up to 82.2%, and by 18.5% on average, compared to the conventional method using a simple mod operation.
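To illustrate the pruning role of $H_2$ described above, here is a minimal Python sketch. It is not the paper's implementation. It assumes items are integers in the range 0 to num_items - 1 and uses a hypothetical hash of the form (x * num_items + y) mod num_buckets as a stand-in for the conventional simple mod method. The function names, the bucket count, and the sample transactions are illustrative assumptions. The proposed hashing mechanism itself is not specified in this abstract and is not shown.

```python
from collections import Counter
from itertools import combinations


def bucket_of(pair, num_items, num_buckets):
    """Map a sorted 2-itemset to a bucket with a simple mod hash (assumed form)."""
    x, y = pair
    return (x * num_items + y) % num_buckets


def dhp_phase1(transactions, num_items, num_buckets):
    """Count 1-itemsets and fill the direct hash table H_2 in a single scan."""
    item_counts = Counter()
    h2 = [0] * num_buckets
    for t in transactions:
        items = sorted(set(t))  # deduplicate and order the items
        item_counts.update(items)
        # Preliminary counting: every 2-subset increments one bucket of H_2.
        for pair in combinations(items, 2):
            h2[bucket_of(pair, num_items, num_buckets)] += 1
    return item_counts, h2


def candidate_pairs(item_counts, h2, num_items, min_support):
    """Build C_2, keeping only pairs whose H_2 bucket reaches min_support."""
    frequent = sorted(i for i, c in item_counts.items() if c >= min_support)
    return [
        pair
        for pair in combinations(frequent, 2)
        if h2[bucket_of(pair, num_items, len(h2))] >= min_support
    ]


# Illustrative data only (hypothetical transactions over items 0..3).
transactions = [[0, 1, 2], [0, 1], [0, 2], [1, 3], [0, 1, 3]]
item_counts, h2 = dhp_phase1(transactions, num_items=4, num_buckets=7)
c2 = candidate_pairs(item_counts, h2, num_items=4, min_support=2)
print(h2)  # bucket counts of H_2
print(c2)  # candidate 2-itemsets that survive pruning
```

A bucket count is an upper bound on the support of every pair hashed into that bucket, so pruning by bucket count never discards a truly frequent pair. Collisions can only let infrequent pairs through, which is why the quality of the hash function affects how many candidates reach the hash tree.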
Keywords
DHP; Direct Hash Table; Hash Tree;