http://dx.doi.org/10.13088/jiis.2014.20.2.137

Resolving the 'Gray sheep' Problem Using Social Network Analysis (SNA) in Collaborative Filtering (CF) Recommender Systems  

Kim, Minsung (School of Business, Yonsei University)
Im, Il (School of Business, Yonsei University)
Publication Information
Journal of Intelligence and Information Systems / v.20, no.2, 2014, pp. 137-148
Abstract
Recommender systems have become one of the most important technologies in e-commerce. For many consumers, the ultimate reason to shop online is to reduce the effort of searching for information and making purchases, and recommender systems are a key technology for serving this need. Many past studies of recommender systems have been devoted to developing and improving recommendation algorithms, and collaborative filtering (CF) is known to be the most successful of them. Despite its success, however, CF has several shortcomings, such as the cold-start, sparsity, and gray sheep problems. To generate recommendations, ordinary CF algorithms require evaluations or preference information directly from users. For new users who have no evaluations or preference information, CF therefore cannot produce recommendations (cold-start problem). As the numbers of products and customers increase, the user-item data grows rapidly and most of its cells are empty. Such a sparse dataset makes computing recommendations extremely hard (sparsity problem). Because CF assumes that there are groups of users sharing common preferences or tastes, it becomes inaccurate when there are many users with rare and unique tastes (gray sheep problem).
This study proposes a new algorithm that uses Social Network Analysis (SNA) techniques to resolve the gray sheep problem. We use 'degree centrality' in SNA to identify users with unique preferences (gray sheep). Degree centrality refers to the number of direct links to and from a node. In a network of users connected through common preferences or tastes, users with unique tastes have fewer links to other users (nodes) and are isolated from them. Gray sheep can therefore be identified by calculating the degree centrality of each node. We divide the dataset into two parts, gray sheep and others, based on the degree centrality of the users.
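The identification of gray sheep by degree centrality can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how two users become linked, so the rule used here (a link when two users share at least `min_shared` items) is an assumption, as are the function names, the toy data, and the threshold value.

```python
from itertools import combinations


def project_to_user_network(user_items, min_shared=2):
    """Turn two-mode (user-item) data into one-mode (user-user) adjacency sets.

    Assumption: two users are linked when they share at least `min_shared`
    items. The abstract does not state the actual linking rule.
    """
    adjacency = {user: set() for user in user_items}
    for u, v in combinations(user_items, 2):
        if len(user_items[u] & user_items[v]) >= min_shared:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def split_by_degree(adjacency, threshold):
    """Split users into (gray_sheep, others) by degree, i.e. link count."""
    degree = {user: len(neighbors) for user, neighbors in adjacency.items()}
    gray_sheep = sorted(u for u, d in degree.items() if d < threshold)
    others = sorted(u for u, d in degree.items() if d >= threshold)
    return gray_sheep, others


# Hypothetical toy data: the set of items each user has rated.
toy_user_items = {
    "ann": {"m1", "m2", "m3"},
    "bob": {"m1", "m2", "m4"},
    "cat": {"m2", "m3", "m4"},
    "dan": {"m8", "m9"},  # shares no items with anyone else
}

toy_network = project_to_user_network(toy_user_items, min_shared=2)
gray_sheep, others = split_by_degree(toy_network, threshold=1)
```

In this toy network "dan" has no links and falls below the threshold, while the other three users are all linked to each other. In the study the threshold is tuned by simulation rather than fixed by hand.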
Different similarity measures and recommendation methods are then applied to these two datasets. The algorithm in more detail is as follows:
Step 1: Convert the initial data, which is a two-mode network (user to item), into a one-mode network (user to user).
Step 2: Calculate the degree centrality of each node and separate the nodes whose degree centrality is lower than a pre-set threshold. The threshold value is determined by simulation such that the accuracy of CF on the remaining dataset is maximized.
Step 3: Apply an ordinary CF algorithm to the remaining dataset.
Step 4: Because the separated dataset consists of users with unique tastes, an ordinary CF algorithm cannot generate recommendations for them. A 'popular item' method is used to generate recommendations for these users.
The F measures of the two datasets are weighted by their numbers of nodes and summed to give the final performance metric.
To test the performance improvement from the new algorithm, an empirical study was conducted using a publicly available dataset, the MovieLens data from the GroupLens research team. We used 100,000 evaluations by 943 users of 1,682 movies. The proposed algorithm was compared with an ordinary CF algorithm using the 'Best-N-neighbors' and 'Cosine' similarity methods. The empirical results show that the F measure improved by about 11% on average when the proposed algorithm was used.
Past studies aiming to improve CF performance typically used information beyond users' evaluations, such as demographic data. Some studies applied SNA techniques as a new similarity metric. This study is novel in that it uses SNA to partition the dataset. It shows that the performance of CF can be improved, without any additional information, when SNA techniques are used as proposed. The study has several theoretical and practical implications. It shows empirically that the characteristics of a dataset can affect the performance of CF recommender systems, which helps researchers understand the factors affecting CF performance. It also opens the door to future studies that apply SNA to CF in order to analyze the characteristics of datasets. In practice, the study provides guidelines for improving the performance of CF recommender systems with a simple modification.
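The 'popular item' fallback of Step 4 and the node-weighted F measure can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the abstract does not define "popular", so popularity is taken here as the number of users who rated an item; the F measure is assumed to be F1; and the weights are normalized to group shares so that the combined score stays between 0 and 1. All names and numbers are hypothetical.

```python
from collections import Counter


def popular_items(user_items, already_seen, top_n=2):
    """Recommend the most frequently rated items the user has not seen.

    Assumption: popularity = number of users who rated the item.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(item for items in user_items.values() for item in items)
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return [item for item in ranked if item not in already_seen][:top_n]


def f_measure(recommended, relevant):
    """F1 score: harmonic mean of precision and recall for one user."""
    hits = len(set(recommended) & set(relevant))
    if hits == 0:
        return 0.0
    precision = hits / len(recommended)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


def weighted_f(f_others, n_others, f_gray, n_gray):
    """Combine the two groups' F measures, weighted by their node counts."""
    total = n_others + n_gray
    return (f_others * n_others + f_gray * n_gray) / total


# Hypothetical training data; "dan" plays the role of a gray sheep user.
train = {
    "ann": {"m1", "m2"},
    "bob": {"m1", "m3"},
    "cat": {"m1", "m2"},
    "dan": {"m9"},
}

recs_for_dan = popular_items(train, already_seen=train["dan"], top_n=2)
f_gray = f_measure(recs_for_dan, relevant={"m1", "m7"})

# 0.8 stands in for the F measure of ordinary CF on the three other users.
combined = weighted_f(f_others=0.8, n_others=3, f_gray=f_gray, n_gray=1)
```

Because the combined score weights each group by its size, a large gain on a small gray sheep group moves the final metric only slightly.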
Keywords
Collaborative filtering (CF); Social Network Analysis (SNA); Gray Sheep Problem