http://dx.doi.org/10.7472/jksii.2015.16.6.89

Performance Analysis of Top-K High Utility Pattern Mining Methods  

Ryang, Heungmo (Dept. of Computer Engineering, Sejong University)
Yun, Unil (Dept. of Computer Engineering, Sejong University)
Kim, Chulhong (Electronics and Telecommunication Research Institute)
Publication Information
Journal of Internet Computing and Services / v.16, no.6, 2015, pp. 89-95
Abstract
Traditional frequent pattern mining discovers valid patterns whose frequency in a database is no smaller than a user-defined minimum threshold. In this framework, a threshold that is too low may extract an enormous number of patterns, which makes the results difficult to analyze, while one that is too high may yield no valid patterns at all. Setting an appropriate threshold is not easy because it requires prior knowledge of the domain. A pattern mining approach that does not depend on domain knowledge therefore became necessary, since the framework cannot precisely predict or control the mining results for a given threshold. Top-k frequent pattern mining was proposed to solve this problem; it mines the k most important patterns without any threshold setting. With this method, users can find patterns ranging from the one with the highest frequency to the one with the k-th highest frequency, regardless of the database. In this paper, we provide background on both frequent and top-k pattern mining. Although top-k frequent pattern mining extracts the top-k significant patterns without a threshold, it considers neither item quantities in transactions nor the relative importance of items in databases, which is why it cannot meet the requirements of many real-world applications. That is, in these applications, patterns with low frequency can be meaningful, and vice versa. High utility pattern mining was proposed to reflect these characteristics of non-binary databases, but it requires a minimum threshold. Recently, top-k high utility pattern mining has been developed, through which users can mine the desired number of high utility patterns without prior knowledge. In this paper, we analyze in detail two algorithms for top-k high utility pattern mining. We also conduct various experiments with the algorithms on real datasets and, through performance analysis of the experimental results, study points for improvement and directions for the future development of top-k high utility pattern mining.
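
The contrast drawn in the abstract between ranking patterns by frequency and ranking them by utility can be illustrated with a minimal sketch. It is a brute-force enumeration over a made-up toy database and is not either of the algorithms analyzed in the paper. The data, the unit profits, and all function names (`enumerate_patterns`, `support`, `utility`, `top_k`) are assumptions introduced only for illustration. Utility is taken here to be the sum of quantity times unit profit over the transactions that contain the pattern, which is the usual definition in high utility pattern mining.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 1, "c": 1},
    {"a": 2, "b": 1, "c": 1},
    {"b": 3},
]
# Hypothetical unit profits, standing in for the relative importance of items.
UNIT_PROFIT = {"a": 1, "b": 5, "c": 20}


def enumerate_patterns(transactions):
    """Return every non-empty itemset that occurs in at least one transaction."""
    seen = set()
    for t in transactions:
        items = sorted(t)
        for size in range(1, len(items) + 1):
            seen.update(combinations(items, size))
    return sorted(seen)


def support(pattern, transactions):
    """Number of transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if all(i in t for i in pattern))


def utility(pattern, transactions, unit_profit):
    """Sum of quantity * unit profit over all transactions containing the pattern."""
    return sum(
        sum(t[i] * unit_profit[i] for i in pattern)
        for t in transactions
        if all(i in t for i in pattern)
    )


def top_k(transactions, k, score):
    """Rank all patterns by score (ties broken by pattern) and keep the best k.

    No minimum threshold is supplied: the user only chooses k.
    """
    scored = [(p, score(p)) for p in enumerate_patterns(transactions)]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


top_frequent = top_k(TRANSACTIONS, 2, lambda p: support(p, TRANSACTIONS))
top_utility = top_k(TRANSACTIONS, 2, lambda p: utility(p, TRANSACTIONS, UNIT_PROFIT))

print(top_frequent)  # [(('a',), 3), (('b',), 3)]
print(top_utility)   # [(('a', 'c'), 43), (('c',), 40)]
```

In this toy data the most frequent pattern `('a',)` has a utility of only 4, while `('a', 'c')` appears in just two transactions yet has the highest utility. Exhaustive enumeration like this grows exponentially with the number of items, so it serves only to make the definitions concrete.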
Keywords
High utility patterns; Top-K mining; Threshold setting; High utility pattern mining; Top-K high utility pattern mining; Performance analysis