http://dx.doi.org/10.3745/KTSDE.2018.7.2.51

Mining High Utility Sequential Patterns Using Sequence Utility Lists  

Park, Jong Soo (School of IT, Sungshin Women's University)
Publication Information
KIPS Transactions on Software and Data Engineering / v.7, no.2, 2018, pp. 51-62
Abstract
High utility sequential pattern (HUSP) mining is an important research topic in data mining. Although several algorithms have been proposed for this task, they generate a large search space for HUSPs. A tighter utility upper bound on a sequence allows more unpromising patterns to be pruned early in the search. In this paper, we propose the sequence expected utility (SEU) as a new utility upper bound for each sequence; it is the maximum expected utility of a sequence and all its descendant sequences. A sequence utility list for each pattern serves as a new data structure that maintains the essential information for mining HUSPs. We devise an algorithm, high sequence utility list-span (HSUL-Span), that identifies HUSPs by employing SEU. Experimental results on both synthetic and real datasets from different domains show that HSUL-Span generates considerably fewer candidate patterns and outperforms other algorithms in terms of execution time.
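To make the role of a utility upper bound concrete, the following is a minimal Python sketch, not the HSUL-Span algorithm. The abstract does not define SEU, so the sketch uses the classic and looser sequence-weighted utility (SWU) bound from earlier HUSP work instead. It also assumes a simplified database in which every element of a sequence holds a single item with a precomputed utility. All names (`pattern_utility`, `database_utility`, `swu`, the toy database `db`) are hypothetical and chosen for illustration only.

```python
from typing import List, Optional, Tuple

# Assumption: each q-sequence is a list of (item, utility) pairs,
# i.e. one item per element, with the utility already computed.
QSequence = List[Tuple[str, int]]


def pattern_utility(pattern: List[str], qseq: QSequence) -> Optional[int]:
    """Maximum utility over all occurrences of `pattern` in `qseq`.

    Returns None when the pattern does not occur in the sequence.
    """
    # best[k] = highest utility of matching the first k pattern items so far
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for item, util in qseq:
        # Scan right to left so this q-item only extends earlier matches.
        for k in range(len(pattern), 0, -1):
            if pattern[k - 1] == item and best[k - 1] is not None:
                candidate = best[k - 1] + util
                if best[k] is None or candidate > best[k]:
                    best[k] = candidate
    return best[len(pattern)]


def database_utility(pattern: List[str], db: List[QSequence]) -> int:
    """Utility of the pattern summed over every sequence that contains it."""
    utilities = (pattern_utility(pattern, s) for s in db)
    return sum(u for u in utilities if u is not None)


def swu(pattern: List[str], db: List[QSequence]) -> int:
    """Sequence-weighted utility: total utility of all sequences containing
    the pattern. It bounds the utility of the pattern and of every
    super-pattern, so it can be used for pruning (SWU, not the paper's SEU)."""
    return sum(
        sum(u for _, u in s)
        for s in db
        if pattern_utility(pattern, s) is not None
    )


# Toy database (made-up numbers, for illustration only).
db: List[QSequence] = [
    [("a", 2), ("b", 6), ("a", 4), ("c", 1)],
    [("b", 3), ("a", 5), ("c", 2)],
    [("c", 4), ("b", 1)],
]

min_util = 20
for pat in (["a", "c"], ["b", "a"], ["c", "b"]):
    bound = swu(pat, db)
    # A pattern whose bound is below min_util can be discarded together
    # with all of its extensions.
    print(pat, database_utility(pat, db), bound, bound >= min_util)
```

In this toy run, the pattern `["c", "b"]` has an SWU of 5, which is below the threshold of 20, so it and all of its extensions could be discarded without further search. A tighter bound such as the SEU proposed in the paper would let this kind of pruning happen for more patterns.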
Keywords
High Utility Sequential Pattern Mining; Sequence Utility List; Candidate Pattern Pruning