
Performance Analysis of Similarity Reflecting Jaccard Index for Solving Data Sparsity in Collaborative Filtering  

Lee, Soojung (Gyeongin National University of Education)
Publication Information
The Journal of Korean Association of Computer Education / v.19, no.4, 2016, pp. 59-66
Abstract
Reflecting the number of co-rated items in the similarity computation has been studied as a way to address the data sparsity problem in collaborative filtering systems. The Jaccard index, a well-known method of this kind, has improved performance when combined with existing similarity measures. However, the degree of improvement obtained from such combinations across different data environments has seldom been analyzed, and analyzing it is the objective of this study. Used as the sole similarity measure, the Jaccard index yielded much higher prediction quality than traditional measures and very high recommendation quality on a sparse dataset. In general, combining existing similarity measures with the Jaccard index improved performance regardless of dataset characteristics. In particular, cosine similarity achieved the largest improvement on sparse datasets, whereas the Mean Squared Difference similarity suffered degraded prediction quality on denser ones. Therefore, one needs to consider the characteristics of both the data environment and the similarity measure before combining it with the Jaccard index.
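As an illustration of the idea in the abstract, the following minimal Python sketch computes the Jaccard index over the sets of items two users have rated and uses it to weight a base similarity. The abstract does not state how the combination is formed; multiplying the two values is an assumption made here for illustration, and the function names (`jaccard`, `cosine`, `jaccard_weighted`) and the sample ratings are hypothetical.

```python
import math


def jaccard(ratings_u, ratings_v):
    """Jaccard index over the sets of items each user has rated."""
    items_u, items_v = set(ratings_u), set(ratings_v)
    union = items_u | items_v
    if not union:
        return 0.0
    return len(items_u & items_v) / len(union)


def cosine(ratings_u, ratings_v):
    """Cosine similarity computed over co-rated items only."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard_weighted(ratings_u, ratings_v, base=cosine):
    """Assumed combination: base similarity scaled by the Jaccard index."""
    return jaccard(ratings_u, ratings_v) * base(ratings_u, ratings_v)


# Hypothetical ratings: the two users share only two of five distinct items.
alice = {"item1": 5, "item2": 3, "item3": 4}
bob = {"item2": 3, "item3": 4, "item4": 1, "item5": 2}

print(jaccard(alice, bob))           # share of co-rated items
print(cosine(alice, bob))            # agreement on the co-rated items
print(jaccard_weighted(alice, bob))  # agreement discounted by low overlap
```

In this example the two users agree perfectly on the two items they both rated, so cosine similarity alone treats them as identical; weighting by the Jaccard index lowers the score to reflect that the overlap is small, which is the effect sparse datasets are meant to benefit from.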
Keywords
Recommender System; Collaborative Filtering; Similarity Measure; Jaccard Index