Data Blurring Method for Solving Sparseness Problem in Collaborative Filtering

A Data Blurring Technique for Solving the Sparseness Problem in Collaborative Filtering

  • Hyungil Kim (Department of Computer Engineering, Dongguk University)
  • Juntae Kim (Department of Computer Engineering, Dongguk University)
  • Published : 2005.06.01

Abstract

Recommendation systems analyze user preferences and recommend items to a user by predicting the user's preference for those items. Among the various recommendation methods, collaborative filtering (CF) has been widely used and successfully applied in practice. However, collaborative filtering has two inherent problems: data sparseness and the cold-start problem. If few preferences are known for a user, it is difficult to find many similar users, and recommendation performance therefore degrades. The problem is most serious when a new user first starts using the system. In this paper we propose a method that integrates additional feature information about users and items into CF to overcome the difficulties caused by sparseness and to improve recommendation accuracy. In our method, we first fill in unknown preference values using the probability distribution of feature values, then generate top-N recommendations by applying collaborative filtering to the modified data. We call this method of filling in unknown preference values data blurring. Several experimental results that show the effectiveness of the proposed method are also presented.
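The two steps described in the abstract (fill in unknown preferences from feature statistics, then run collaborative filtering on the modified data) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact estimator, so the code assumes binary preferences (1 = liked, 0 = disliked, `None` = unknown), a single categorical feature per item, and a Laplace-smoothed per-user estimate of P(like | feature value). The names `blur`, `cosine`, `top_n`, and the parameters `alpha`, `n`, and `k` are hypothetical.

```python
from math import sqrt

def blur(ratings, item_feature, alpha=1.0):
    """Fill unknown preferences (None) with a smoothed per-user estimate of
    P(like | feature value of the item). Known values are kept unchanged.
    Assumption: Laplace smoothing with pseudo-count alpha."""
    blurred = []
    for row in ratings:
        likes, seen = {}, {}
        for r, f in zip(row, item_feature):
            if r is not None:
                seen[f] = seen.get(f, 0) + 1
                likes[f] = likes.get(f, 0) + r
        blurred.append([
            r if r is not None
            else (likes.get(f, 0) + alpha) / (seen.get(f, 0) + 2 * alpha)
            for r, f in zip(row, item_feature)
        ])
    return blurred

def cosine(u, v):
    """Cosine similarity between two preference vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_n(ratings, blurred, user, n=2, k=2):
    """User-based CF on the blurred matrix: score each item the user has not
    rated by the similarity-weighted average over the k nearest neighbors."""
    sims = sorted(
        ((cosine(blurred[user], blurred[v]), v)
         for v in range(len(blurred)) if v != user),
        reverse=True)[:k]
    total = sum(s for s, _ in sims)
    scores = {}
    for i, r in enumerate(ratings[user]):
        if r is None:
            weighted = sum(s * blurred[v][i] for s, v in sims)
            scores[i] = weighted / total if total else 0.0
    # Highest score first; ties broken by item index.
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]

# Toy data (made up for illustration): 3 users, 4 items, one feature per item.
item_feature = ["action", "action", "comedy", "comedy"]
ratings = [
    [1, None, 0, None],
    [1, 1, None, 0],
    [0, None, 1, 1],
]
blurred = blur(ratings, item_feature)
recommended = top_n(ratings, blurred, user=0, n=1)
```

In this toy example, user 0 has only two known preferences, so the raw vectors overlap very little with those of other users. After blurring, every user has a dense vector, and similarity can be computed over all items rather than only over co-rated ones, which is the effect the abstract attributes to data blurring.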

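A recommendation system analyzes user preferences, predicts a user's preference for items, and recommends items accordingly. Among the various recommendation techniques, collaborative filtering has been successfully applied in commercial systems. However, collaborative filtering is vulnerable to the data sparseness problem and the cold-start problem. When very little preference data exists, it is difficult to find many similar users, which degrades recommendation performance. Moreover, no items at all can be recommended to a new user who has no preference information. In this paper, we propose a technique that integrates additional feature information about users and items to address the sparseness and cold-start problems of collaborative filtering and to improve recommendation performance. The proposed technique modifies the preference data by predicting unknown preference values from the probability distribution of the additional feature information, then applies collaborative filtering to the modified preference data to generate top-N recommendations. This way of modifying the preference data is called data blurring. Several experimental results confirm the effectiveness of the proposed technique.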
