[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.3745/KIPSTD.2005.12D.6.817

Efficient Mining of Frequent Itemsets in a Sparse Data Set

Park In-Chang (Samsung Electronics)
Chang Joong-Hyuk (Software Application Research Institute, Yonsei University)
Lee Won-Suk (Department of Computer Science, Yonsei University)

Publication Information

The KIPS Transactions: Part D, v.12D, no.6, 2005, pp. 817-828

Abstract

The main research problems in mining frequent itemsets are reducing the memory usage and the processing time of the mining process. Most previous algorithms for finding frequent itemsets are based on the Apriori property and are multi-scan algorithms. Moreover, their processing time increases greatly as the length of a maximal frequent itemset grows. To overcome this drawback, other approaches have been actively proposed in previous research to reduce the processing time. However, they are not efficient on a sparse data set. This paper proposes an efficient mining algorithm for finding frequent itemsets. It introduces a novel tree structure, called an $L_2$-tree, and an efficient algorithm for mining frequent itemsets with the $L_2$-tree, called the $L_2$-traverse algorithm. An $L_2$-tree is constructed from $L_2$, i.e., the set of frequent itemsets of size 2, and the $L_2$-traverse algorithm finds its mining result in a short time by traversing the $L_2$-tree once. To reduce the processing time further, this paper also proposes an optimized algorithm, $C_3$-traverse, which removes in advance each itemset in $L_2$ that cannot contribute to a frequent itemset of size 3. Various experiments verify that the proposed algorithms are efficient on a sparse data set.
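
The idea of growing larger itemsets only from frequent pairs can be illustrated with a minimal Python sketch. It is not the paper's $L_2$-tree or $L_2$-traverse algorithm, whose details the abstract does not give. The names `frequent_pairs`, `build_adjacency`, and `mine` are hypothetical, the adjacency map is an assumed stand-in for the tree, and supports are recounted against in-memory transactions, which the abstract does not specify. The sketch relies only on the Apriori property: every 2-subset of a frequent itemset must itself be in $L_2$.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_count):
    """Return L2: the frequent 2-itemsets, each as a sorted tuple."""
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c >= min_count}
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t) & frequent_items)
        pair_counts.update(combinations(items, 2))
    return {p for p, c in pair_counts.items() if c >= min_count}


def build_adjacency(l2):
    """Assumed stand-in for the L2-tree: item -> larger items it pairs with."""
    adj = {}
    for a, b in l2:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set())
    return adj


def mine(transactions, min_count):
    """Depth-first search over L2; only itemsets whose pairs are all in L2 are counted."""
    tsets = [frozenset(t) for t in transactions]
    l2 = frequent_pairs(transactions, min_count)
    adj = build_adjacency(l2)
    found = set(l2)

    def support(itemset):
        return sum(1 for t in tsets if t.issuperset(itemset))

    def extend(prefix, allowed):
        # 'allowed' holds items that form a frequent pair with every item in 'prefix'.
        for item in sorted(allowed):
            cand = prefix + (item,)
            if len(cand) >= 3:
                if support(cand) < min_count:
                    continue  # supersets of an infrequent itemset cannot be frequent
                found.add(cand)
            extend(cand, allowed & adj[item])

    for first in sorted(adj):
        extend((first,), adj[first])
    return found


if __name__ == "__main__":
    txns = [["a", "b", "c"], ["a", "b", "c"], ["a", "b"], ["b", "c", "d"], ["a", "d"]]
    print(sorted(mine(txns, min_count=2)))
```

In a sparse data set $L_2$ tends to be small, so few candidates of size 3 or more survive the pair check, which is the intuition behind starting from $L_2$. The result here contains itemsets of size 2 and larger only.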

Keywords

Frequent Itemset; Sparse Data Set

