http://dx.doi.org/10.3745/JIPS.04.0209

Effective and Efficient Similarity Measures for Purchase Histories Considering Product Taxonomy  

Yang, Yu-Jeong (Dept. of Computer Science, Sookmyung Women's University)
Lee, Ki Yong (Dept. of Computer Science, Sookmyung Women's University)
Publication Information
Journal of Information Processing Systems / v.17, no.1, 2021, pp. 107-123
Abstract
In an online shopping site or an offline store, the products a customer purchases over time form that customer's purchase history. Most retailers also maintain a product taxonomy, a hierarchical classification of their products. Under such a taxonomy, the lower (that is, the more specific) the lowest category that two products share, the more similar the two products are. However, there has been little work on similarity measures for sequences that take a hierarchical classification of the elements into account. In this paper, we propose new similarity measures for purchase histories that consider not only the order in which products were purchased but also the hierarchical classification of the products. Existing methods set the similarity between two sequence elements to either 0 or 1, depending only on whether the elements are identical; the proposed method instead assigns any real number between 0 and 1 according to the elements' positions in the hierarchy. We apply this idea to extend three representative existing similarity measures for sequences, and we also propose an efficient method for computing the proposed measures. Through various experiments, we show that the proposed measures capture the similarity between purchase histories effectively and can be computed efficiently.
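To make the idea of a graded element similarity concrete, the following is a minimal sketch and not the authors' formulation. It assumes a hypothetical toy taxonomy (`PARENT`), an illustrative element similarity defined as the depth of the lowest common ancestor divided by the larger of the two product depths, and an edit-distance-style measure whose substitution cost is one minus that similarity. The abstract does not state the paper's actual formulas or name the three measures it extends, so every name and formula below is an assumption chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (child -> parent); the root has parent None.
PARENT: Dict[str, Optional[str]] = {
    "All": None,
    "Food": "All",
    "Electronics": "All",
    "Snack": "Food",
    "Drink": "Food",
    "Chips": "Snack",
    "Cookies": "Snack",
    "Cola": "Drink",
    "Phone": "Electronics",
}


def path_to_root(node: str) -> List[str]:
    """Return the nodes from `node` up to the root, inclusive."""
    path = []
    cur: Optional[str] = node
    while cur is not None:
        path.append(cur)
        cur = PARENT[cur]
    return path


def depth(node: str) -> int:
    """Depth of a node, with the root at depth 0."""
    return len(path_to_root(node)) - 1


def lowest_common_ancestor(a: str, b: str) -> str:
    """Deepest category that contains both a and b (naive upward walk)."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes do not share a root")


def product_similarity(a: str, b: str) -> float:
    """Assumed similarity in [0, 1]: 1 for identical products, 0 when the
    only shared category is the root, larger when the shared category is deeper."""
    if a == b:
        return 1.0
    shared = lowest_common_ancestor(a, b)
    return depth(shared) / max(depth(a), depth(b))


def taxonomy_edit_distance(s: List[str], t: List[str]) -> float:
    """Edit distance where substituting x by y costs 1 - product_similarity(x, y)
    instead of the usual 0/1 cost; insertions and deletions still cost 1."""
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = float(i)
    for j in range(1, n + 1):
        d[0][j] = float(j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            substitution = 1.0 - product_similarity(s[i - 1], t[j - 1])
            d[i][j] = min(
                d[i - 1][j] + 1.0,               # delete s[i-1]
                d[i][j - 1] + 1.0,               # insert t[j-1]
                d[i - 1][j - 1] + substitution,  # substitute (graded cost)
            )
    return d[m][n]
```

With this toy taxonomy, a plain 0/1 edit distance would rate the history `["Chips", "Cola"]` as equally far from `["Cookies", "Cola"]` and from `["Phone", "Cola"]`. The graded version places the first pair closer together, because chips and cookies share the deeper category `Snack`, while chips and a phone share only the root.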
Keywords
Hierarchical Classification; Purchase History; Sequence Similarity; Similarity Measure;