http://dx.doi.org/10.3745/JIPS.2010.6.1.079

Mining Frequent Itemsets with Normalized Weight in Continuous Data Streams  

Kim, Young-Hee (School of Information and Communication Engineering, Sungkyunkwan University)
Kim, Won-Young (School of Information and Communication Engineering, Sungkyunkwan University)
Kim, Ung-Mo (School of Information and Communication Engineering, Sungkyunkwan University)
Publication Information
Journal of Information Processing Systems / v.6, no.1, 2010, pp. 79-90
Abstract
A data stream is a massive, unbounded sequence of data elements generated continuously at a rapid rate. Because streaming data arrive continuously, knowledge discovery requires algorithms that scan the stream only once. Data mining over data streams should also support a flexible trade-off between processing time and mining accuracy. In many application areas, weighted frequent itemset mining has been suggested as a way to find important frequent itemsets by taking the weight of itemsets into account. In this paper, we present WSFI-Mine (Weighted Support Frequent Itemsets Mine), an efficient algorithm for mining frequent itemsets with normalized weights over data streams. Moreover, we propose a novel tree structure, called the Weighted Support FP-Tree (WSFP-Tree), that stores compressed, crucial information about frequent itemsets. Empirical results show that our algorithm outperforms the compared algorithms under the windowed streaming model.
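The abstract does not define weighted support or the normalization scheme, so the following is only an illustrative sketch of the general idea, not WSFI-Mine or the WSFP-Tree. It assumes min-max normalization of item weights into [0, 1], defines weighted support as the average normalized weight of an itemset multiplied by its support in the current window, and uses a fixed-size sliding window that touches each transaction once on arrival and once on expiry. All names (`normalize_weights`, `SlidingWindowMiner`, `max_len`) are hypothetical.

```python
from collections import deque
from itertools import combinations


def normalize_weights(raw):
    """Min-max scale raw item weights into [0, 1] (an assumed scheme)."""
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        return {item: 1.0 for item in raw}
    return {item: (w - lo) / (hi - lo) for item, w in raw.items()}


class SlidingWindowMiner:
    """Counts itemsets over the most recent `window_size` transactions."""

    def __init__(self, raw_weights, window_size, max_len=2):
        self.weights = normalize_weights(raw_weights)
        self.window_size = window_size
        self.max_len = max_len      # cap on itemset length, keeps the sketch small
        self.window = deque()
        self.counts = {}            # itemset (sorted tuple) -> count in window

    def _update(self, transaction, delta):
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            for itemset in combinations(items, k):
                new_count = self.counts.get(itemset, 0) + delta
                if new_count:
                    self.counts[itemset] = new_count
                else:
                    self.counts.pop(itemset, None)

    def add(self, transaction):
        """Process one arriving transaction; expire the oldest if needed."""
        self.window.append(transaction)
        self._update(transaction, +1)
        if len(self.window) > self.window_size:
            self._update(self.window.popleft(), -1)

    def weighted_support(self, itemset):
        """Assumed definition: mean normalized weight times window support."""
        if not self.window:
            return 0.0
        key = tuple(sorted(itemset))
        support = self.counts.get(key, 0) / len(self.window)
        mean_weight = sum(self.weights[i] for i in key) / len(key)
        return mean_weight * support

    def frequent(self, min_wsup):
        """Return itemsets whose weighted support reaches the threshold."""
        result = {}
        for itemset in self.counts:
            ws = self.weighted_support(itemset)
            if ws >= min_wsup:
                result[itemset] = ws
        return result


if __name__ == "__main__":
    miner = SlidingWindowMiner({"a": 1.0, "b": 3.0, "c": 5.0}, window_size=3)
    for t in (["a", "b"], ["b", "c"], ["a", "c"], ["b", "c"]):
        miner.add(t)
    print(miner.frequent(min_wsup=0.4))
```

Enumerating every subset of a transaction is exponential in transaction length, which is why this sketch caps itemset length with `max_len`; a compressed prefix-tree structure such as the WSFP-Tree described in the abstract is the kind of design that avoids this cost.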
Keywords
Frequent Itemsets; Weighted Support; Window Sliding; Weighted Support FP-Tree; Data Stream; WSFI-Mine