Analysis of Data Imputation in Recommender Systems

추천 시스템에서의 데이터 임퓨테이션 분석

  • 이영남 (Division of Computer Software, Hanyang University)
  • 김상욱 (Division of Computer Software, Hanyang University)
  • Received : 2017.05.31
  • Accepted : 2017.10.24
  • Published : 2017.12.15

Abstract

Recommender systems (RS) that predict a set of items a target user is likely to prefer have been extensively studied in academia and have been aggressively implemented by many companies such as Google, Netflix, eBay, and Amazon. Data imputation alleviates the data sparsity problem occurring in recommender systems by inferring missing ratings and adding them to the original data. In this paper, we point out the drawbacks of existing approaches and make suggestions for data imputation techniques. We also justify our suggestions through extensive experiments.

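A recommender system suggests personalized items that a user is likely to prefer. As the number of available items keeps growing, recommender systems are becoming more important, yet data sparsity remains one of their representative problems. Data sparsity arises because each user rates only a small fraction of all items, which makes it hard to understand the relationship between users and items accurately. Several approaches address this problem, and data imputation is a representative one: it infers ratings for items the user has not rated and fills them into the rating matrix. However, existing data imputation methods overlook several characteristics of the items a user has not rated. In this paper, we define the limitations of existing methods and propose three ways to improve them.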
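To make the fill-in step described in the abstract concrete, the following is a minimal sketch of data imputation on a small rating matrix. It assumes a naive item-mean rule (with a global-mean fallback for items nobody has rated), which is a stand-in chosen for illustration and not the method analyzed or proposed in this paper. The names `impute_item_mean` and `sparsity` and the toy ratings are hypothetical.

```python
from typing import List, Optional

# A rating matrix: rows are users, columns are items, None marks a missing rating.
Matrix = List[List[Optional[float]]]


def sparsity(ratings: Matrix) -> float:
    """Return the fraction of user-item cells that have no rating."""
    total = sum(len(row) for row in ratings)
    missing = sum(1 for row in ratings for r in row if r is None)
    return missing / total


def impute_item_mean(ratings: Matrix) -> List[List[float]]:
    """Fill each missing rating with the mean of that item's observed ratings.

    Assumption for this sketch: items with no observed rating at all
    receive the global mean of every observed rating.
    """
    observed = [r for row in ratings for r in row if r is not None]
    if not observed:
        raise ValueError("rating matrix has no observed ratings")
    global_mean = sum(observed) / len(observed)

    n_items = len(ratings[0])
    item_means = []
    for j in range(n_items):
        column = [row[j] for row in ratings if row[j] is not None]
        item_means.append(sum(column) / len(column) if column else global_mean)

    # Observed ratings are kept as-is; only the missing cells are replaced.
    return [
        [r if r is not None else item_means[j] for j, r in enumerate(row)]
        for row in ratings
    ]


# Toy data (hypothetical): 3 users, 4 items, half of the cells are missing.
ratings: Matrix = [
    [5.0, None, 3.0, None],
    [4.0, 2.0, None, None],
    [None, 4.0, 1.0, None],
]
filled = impute_item_mean(ratings)
```

After imputation every cell holds a value, so a downstream collaborative filtering method can be trained on the denser matrix. This rule treats every missing entry the same way, so it only illustrates the mechanics of imputation and says nothing about the three improvements the paper proposes.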

Acknowledgement

Supported by: National Research Foundation of Korea (NRF), Institute for Information & Communications Technology Promotion (IITP)
