Supervised Learning-Based Collaborative Filtering Using Market Basket Data for the Cold-Start Problem

  • Hwang, Wook-Yeon (Data Analytics Department, Institute for Infocomm Research, A*STAR)
  • Jun, Chi-Hyuck (Department of Industrial and Management Engineering, Pohang University of Science and Technology)
  • Received : 2014.07.12
  • Accepted : 2014.11.07
  • Published : 2014.12.30

Abstract

Market basket data, in the form of a binary user-item matrix or a binary item-user matrix, can be modelled as a binary classification problem. The binary logistic regression approach tackles this problem using principal components as predictor variables. If users or items are sparse in the training data, the binary classification problem can be regarded as a cold-start problem, and binary logistic regression may not function appropriately when the principal components are inefficient in that setting. Assuming that market basket data can also be treated as a special regression problem whose response is either 0 or 1, we propose three supervised learning approaches to tackle the cold-start problem: random forest regression, random forest classification, and elastic net. We compare their performance in a variety of experimental settings. The experimental results show that the proposed supervised learning approaches outperform the conventional approaches.
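To make the "special regression problem whose response is either 0 or 1" concrete, the following is a minimal sketch and not the authors' implementation. It assumes one target item is taken as the 0/1 response and the remaining item columns serve as predictors (the abstract does not specify the predictors used by the proposed methods), and it fits an elastic net penalty with a squared-error loss by plain coordinate descent. The toy matrix `basket`, the helper names, and the values of `lam` and `alpha` are all hypothetical.

```python
# Hypothetical toy data: rows are users, columns are items, 1 = purchased.
basket = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
]


def make_supervised(matrix, target_item):
    """Use one item column as the 0/1 response and the other columns as predictors."""
    X = [[v for j, v in enumerate(row) if j != target_item] for row in matrix]
    y = [row[target_item] for row in matrix]
    return X, y


def soft_threshold(z, gamma):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net(X, y, lam=0.05, alpha=0.5, n_iter=200):
    """Coordinate descent for
    (1/2n) * sum (y - b0 - X b)^2 + lam * (alpha * |b|_1 + (1 - alpha) / 2 * |b|_2^2).
    lam and alpha are illustrative values, not tuned settings."""
    n, p = len(X), len(X[0])
    b0, b = 0.0, [0.0] * p
    for _ in range(n_iter):
        # The intercept is not penalized: set it to the mean residual.
        b0 = sum(y[i] - sum(X[i][k] * b[k] for k in range(p)) for i in range(n)) / n
        for j in range(p):
            rho, z = 0.0, 0.0
            for i in range(n):
                # Residual with coefficient j left out of the fit.
                partial = y[i] - b0 - sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                z += X[i][j] ** 2
            rho, z = rho / n, z / n
            denom = z + lam * (1.0 - alpha)
            b[j] = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
    return b0, b


def predict(b0, b, x):
    """Real-valued purchase score for one user; higher means more likely to buy."""
    return b0 + sum(bj * xj for bj, xj in zip(b, x))


X, y = make_supervised(basket, target_item=0)
b0, b = elastic_net(X, y)
# Score a hypothetical user who bought only the first of the predictor items.
print(round(predict(b0, b, [1, 0, 0]), 3))
```

Because the response is binary, the fitted score is not a probability; in a sketch like this it would only be used to rank users or items, or be thresholded to obtain a 0/1 prediction.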
