http://dx.doi.org/10.3745/KIPSTD.2005.12D.6.817

Efficient Mining of Frequent Itemsets in a Sparse Data Set  

Park In-Chang (Samsung Electronics)
Chang Joong-Hyuk (Software Application Research Institute, Yonsei University)
Lee Won-Suk (Department of Computer Science, Yonsei University)
Abstract
The main research problems in mining frequent itemsets are reducing the memory usage and the processing time of the mining process. Most previous algorithms for finding frequent itemsets are based on the Apriori property and are multi-scan algorithms, so their processing time increases greatly as the length of a maximal frequent itemset grows. To overcome this drawback, other approaches have been actively proposed to reduce the processing time; however, they are not efficient on a sparse data set. This paper proposes an efficient mining algorithm for finding frequent itemsets. A novel tree structure, called an $L_2$-tree, is proposed, together with an efficient algorithm for mining frequent itemsets using the $L_2$-tree, called the $L_2$-traverse algorithm. An $L_2$-tree is constructed from $L_2$, i.e., the set of frequent itemsets of size 2, and the $L_2$-traverse algorithm finds its mining result in a short time by traversing the $L_2$-tree once. To reduce the processing time further, this paper also proposes an optimized algorithm, $C_3$-traverse, which removes in advance the itemsets in $L_2$ that cannot be part of a frequent itemset of size 3. Various experiments verify that the proposed algorithms are efficient on a sparse data set.
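The abstract does not give the layout of the $L_2$-tree, so the following is only a minimal Python sketch of the general idea it describes: compute $L_2$ first, then grow longer itemsets by walking a structure built from $L_2$. The function names (`frequent_pairs`, `build_adjacency`, `mine`), the adjacency-map representation, and the per-candidate support recount are all assumptions made for illustration, not the authors' data structure or the $L_2$-traverse algorithm itself.

```python
from collections import defaultdict
from itertools import combinations


def frequent_pairs(transactions, min_count):
    """Return L2: item pairs (a, b) with a < b and support >= min_count."""
    item_counts = defaultdict(int)
    for t in transactions:
        for item in set(t):
            item_counts[item] += 1
    frequent_items = {i for i, c in item_counts.items() if c >= min_count}

    pair_counts = defaultdict(int)
    for t in transactions:
        # Only frequent items can appear in a frequent pair.
        items = sorted(set(t) & frequent_items)
        for pair in combinations(items, 2):
            pair_counts[pair] += 1
    return {p for p, c in pair_counts.items() if c >= min_count}


def build_adjacency(l2):
    """Hypothetical stand-in for the L2-tree: item -> larger frequent partners."""
    children = defaultdict(set)
    for a, b in l2:
        children[a].add(b)
    return children


def support(itemset, transactions):
    """Count the transactions that contain every item of itemset."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= set(t))


def mine(transactions, min_count):
    """Return {itemset tuple: support} for frequent itemsets of size >= 2."""
    children = build_adjacency(frequent_pairs(transactions, min_count))
    result = {}

    def extend(prefix, candidates):
        for item in sorted(candidates):
            itemset = prefix + (item,)
            count = support(itemset, transactions)
            if count < min_count:
                continue
            result[itemset] = count
            # Keep only items that are frequent with every item chosen so far.
            extend(itemset, candidates & children.get(item, set()))

    for first in sorted(children):
        extend((first,), children[first])
    return result


transactions = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "b"],
    ["b", "c"],
    ["a", "d"],
]
print(mine(transactions, min_count=2))
```

In this sketch an item is considered as an extension only if it forms a frequent pair with every item already in the prefix, which is where the restriction to $L_2$ cuts down the search on sparse data. The sketch recounts support against the raw transactions for each candidate, a simplification that the single-traversal algorithm described in the abstract presumably avoids.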
Keywords
Frequent Itemset; Sparse Data Set