http://dx.doi.org/10.7232/iems.2014.13.4.421

Supervised Learning-Based Collaborative Filtering Using Market Basket Data for the Cold-Start Problem  

Hwang, Wook-Yeon (Data Analytics Department, Institute for Infocomm Research, A*STAR)
Jun, Chi-Hyuck (Department of Industrial and Management Engineering, Pohang University of Science and Technology)
Publication Information
Industrial Engineering and Management Systems / v.13, no.4, 2014, pp. 421-431
Abstract
Market basket data, represented as a binary user-item matrix or a binary item-user matrix, can be modelled as a binary classification problem. The binary logistic regression approach tackles this classification problem with principal components as predictor variables. When users or items are sparse in the training data, the binary classification problem can be regarded as a cold-start problem. Binary logistic regression may not perform well if the principal components are ineffective for the cold-start problem. Treating market basket data as a special regression problem whose response is either 0 or 1, we propose three supervised learning approaches to the cold-start problem (random forest regression, random forest classification, and elastic net) and compare their performance in a variety of experimental settings. The experimental results show that the proposed supervised learning approaches outperform the conventional approaches.
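To make the "regression with a 0/1 response" idea concrete, here is a minimal sketch of one of the three approaches, the elastic net, fitted by plain coordinate descent on a toy binary user-item matrix. It is not the authors' implementation: the data, the function names (`soft_threshold`, `fit_elastic_net`, `score`), and the values of `lam` and `alpha` are all assumptions chosen for illustration. One item column serves as the response and the remaining item columns serve as predictors.

```python
# Toy binary user-item matrix (rows: users, columns: items); 1 = purchased.
# The data are invented for illustration only.
baskets = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
TARGET = 3  # hypothetical choice: predict purchases of the last item

# Response is the 0/1 target column; predictors are the other item columns.
X = [[row[j] for j in range(len(row)) if j != TARGET] for row in baskets]
y = [row[TARGET] for row in baskets]


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma (the lasso part of the update)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def fit_elastic_net(X, y, lam=0.05, alpha=0.5, n_iter=200):
    """Coordinate descent for
    (1/2n) * ||y - b0 - X b||^2 + lam * (alpha * |b|_1 + (1 - alpha)/2 * |b|_2^2).
    lam and alpha are assumed values, not tuned ones."""
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    beta = [0.0] * p
    for _ in range(n_iter):
        # The unpenalized intercept is the mean of the current residuals.
        intercept = sum(
            y[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)
        ) / n
        for j in range(p):
            # Average covariance of column j with the partial residual
            # (the residual computed without column j's contribution).
            rho = sum(
                X[i][j] * (y[i] - intercept
                           - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            denom = sum(X[i][j] ** 2 for i in range(n)) / n + lam * (1 - alpha)
            beta[j] = soft_threshold(rho, lam * alpha) / denom if denom else 0.0
    return intercept, beta


intercept, beta = fit_elastic_net(X, y)


def score(basket):
    """Predicted preference for the target item given the other items."""
    return intercept + sum(b * x for b, x in zip(beta, basket))


# Rank two sparse baskets by their predicted score for the target item.
for candidate in ([1, 0, 0], [0, 0, 1]):
    print(candidate, round(score(candidate), 3))
```

The fitted scores are not probabilities; they are used only to rank users or items for recommendation, which is why a regression fit to a 0/1 response can be workable here. In practice a library implementation with cross-validated penalty parameters would likely replace this hand-written loop.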
Keywords
Market Basket Data; Cold-Start Problem; Supervised Learning-Based Collaborative Filtering; Random Forest; Elastic Net;