Improved Association Rule Mining by Modified Trimming  

Hwang, Won-Tae (School of Electrical and Electronics Engineering, Chung-Ang University)
Kim, Dong-Seung (School of Electrical Engineering, Korea University)
Abstract
This paper presents a new association rule mining algorithm that uses two-phase sampling to shorten execution time at the cost of some precision in the mining result. The previous FAST (Finding Associations from Sampled Transactions) algorithm has the weakness that it considers only frequent 1-itemsets during trimming/growing, so it has no way of accounting for multi-itemsets, including 2-itemsets. The new algorithm takes multi-itemsets into account when sampling transactions. It improves the mining results by adjusting the counts of both missing itemsets and false itemsets. Experimentally, on a representative synthetic database, the algorithm produces a sampled subset with increased accuracy in terms of 2-itemsets while maintaining the same quality of the data set.
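The idea of trimming a sample against both 1-itemsets and 2-itemsets can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the distance measure or the trimming rule, so the squared-difference distance, the greedy one-transaction-at-a-time removal, and all function names (`itemsets_of`, `count_itemsets`, `distance`, `trim`, `two_phase_sample`) are assumptions made for illustration only.

```python
import random
from collections import Counter
from itertools import combinations


def itemsets_of(transaction, max_size=2):
    """Yield every itemset of size 1..max_size contained in one transaction."""
    items = sorted(set(transaction))
    for k in range(1, max_size + 1):
        yield from combinations(items, k)


def count_itemsets(transactions, max_size=2):
    """Count how many transactions contain each itemset of size 1..max_size."""
    counts = Counter()
    for t in transactions:
        counts.update(itemsets_of(t, max_size))
    return counts


def distance(counts, n, ref_support):
    """Squared distance between supports counts/n and the reference supports.

    Assumption: the source does not specify a distance; this is a stand-in.
    """
    return sum((counts.get(s, 0) / n - p) ** 2 for s, p in ref_support.items())


def trim(initial_sample, final_size, max_size=2):
    """Greedily drop transactions so itemset supports stay close to the
    supports observed in the initial sample (hypothetical trimming rule)."""
    ref_n = len(initial_sample)
    ref_counts = count_itemsets(initial_sample, max_size)
    ref_support = {s: c / ref_n for s, c in ref_counts.items()}

    sample = list(initial_sample)
    counts = Counter(ref_counts)
    while len(sample) > final_size:
        best_i, best_d = None, None
        for i, t in enumerate(sample):
            # Supports that would result from removing transaction i.
            trial = Counter(counts)
            trial.subtract(itemsets_of(t, max_size))
            d = distance(trial, len(sample) - 1, ref_support)
            if best_d is None or d < best_d:
                best_i, best_d = i, d
        counts.subtract(itemsets_of(sample[best_i], max_size))
        sample.pop(best_i)
    return sample


def two_phase_sample(database, initial_size, final_size, max_size=2, seed=0):
    """Phase 1: draw a random initial sample. Phase 2: trim it."""
    rng = random.Random(seed)
    initial = rng.sample(database, initial_size)
    return trim(initial, final_size, max_size)
```

Calling `two_phase_sample(db, initial_size, final_size, max_size=1)` restricts the comparison to 1-itemsets, which is the limitation the abstract attributes to FAST; `max_size=2` also matches pair supports. The greedy loop is quadratic in the sample size and is meant only to make the idea concrete.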
Keywords
data mining; association rule mining; random sampling
References
1 R. Agrawal and R. Srikant, "Fast algorithms for mining association rules", In Proc. VLDB Conf., 1994, pp. 487-499
2 J. Han, J. Pei, and Y. Yin, "Mining frequent patterns without candidate generation", SIGMOD, 2000
3 G. Liu, H. Liu, Y. Xu, and J.X. Yu, "Ascending frequency ordered prefix-tree: efficient mining of frequent patterns", Procs. DASFAA 200
4 B. Chen, P. Haas, and P. Scheuermann, "A new two-phase sampling based algorithm for discovering association rules", SIGKDD, 2002
5 I. Pramudiono and M. Kitsuregawa, "Parallel FP-growth on PC cluster", In Proc. 7th Pacific Asia Conference on Knowledge Discovery and Data Mining, pp. 467-473, 2003
6 M. Lee (이문환), "Improved Association Rule Mining Based on FAST (Finding Associations from Sampled Transactions) Algorithm", Master's thesis, Korea University, July 2004
7 M. Lee and D. Kim, "Modified association rule mining based on two-stage data sampling", Procs. KISS (Korea Information Systems Society) Conf. on Parallel Processing System, Vol. 16 No. 1, pp.69-74, Jan. 2005
8 H. Toivonen, "Sampling large databases for association rules", In Proc. VLDB Conf., 1996