Search | Korea Science

Resolving the 'Gray sheep' Problem Using Social Network Analysis (SNA) in Collaborative Filtering (CF) Recommender Systems (소셜 네트워크 분석 기법을 활용한 협업필터링의 특이취향 사용자(Gray Sheep) 문제 해결)

Kim, Minsung;Im, Il
- Journal of Intelligence and Information Systems / v.20 no.2 / pp.137-148 / 2014

Recommender systems have become one of the most important technologies in e-commerce. For many consumers, the ultimate reason to shop online is to reduce the effort of searching for information and making purchases, and recommender systems are a key technology for serving these needs. Many past studies of recommender systems have been devoted to developing and improving recommendation algorithms, and collaborative filtering (CF) is known to be the most successful of these. Despite its success, however, CF has several shortcomings, such as the cold-start, sparsity, and gray sheep problems. To generate recommendations, ordinary CF algorithms require evaluations or preference information directly from users. CF therefore cannot produce recommendations for new users who have no evaluations or preference information (cold-start problem). As the numbers of products and customers increase, the scale of the data increases exponentially and most of the data cells are empty. This sparse dataset makes computation for recommendation extremely hard (sparsity problem). Since CF is based on the assumption that there are groups of users sharing common preferences or tastes, CF becomes inaccurate if there are many users with rare and unique tastes (gray sheep problem).

This study proposes a new algorithm that uses Social Network Analysis (SNA) techniques to resolve the gray sheep problem. We use 'degree centrality' in SNA to identify users with unique preferences (gray sheep). Degree centrality refers to the number of direct links to and from a node. In a network of users who are connected through common preferences or tastes, those with unique tastes have fewer links to other users (nodes) and are isolated from them. Gray sheep can therefore be identified by calculating the degree centrality of each node. We divide the dataset into two parts, gray sheep and others, based on the degree centrality of the users, and then apply different similarity measures and recommendation methods to the two datasets. The algorithm in detail is as follows:
Step 1: Convert the initial data, which is a two-mode network (user to item), into a one-mode network (user to user).
Step 2: Calculate the degree centrality of each node and separate out the nodes whose degree centrality is lower than a pre-set threshold. The threshold value is determined by simulation such that the accuracy of CF on the remaining dataset is maximized.
Step 3: Apply an ordinary CF algorithm to the remaining dataset.
Step 4: Since the separated dataset consists of users with unique tastes, an ordinary CF algorithm cannot generate recommendations for them. A 'popular item' method is used to generate recommendations for these users.
The F measures of the two datasets are weighted by their numbers of nodes and summed to give the final performance metric.

To test the performance improvement from this new algorithm, an empirical study was conducted using a publicly available dataset, the MovieLens data from the GroupLens research team. We used 100,000 evaluations by 943 users of 1,682 movies. The proposed algorithm was compared with an ordinary CF algorithm using the 'Best-N-neighbors' and 'Cosine' similarity methods. The empirical results show that the F measure improved by about 11% on average when the proposed algorithm was used.

Past studies aiming to improve CF performance typically used information beyond users' evaluations, such as demographic data. Some studies applied SNA techniques as a new similarity metric. This study is novel in that it uses SNA to partition the dataset. It shows that the performance of CF can be improved, without any additional information, when SNA techniques are used as proposed. The study has several theoretical and practical implications. It shows empirically that the characteristics of a dataset can affect the performance of CF recommender systems, which helps researchers understand the factors affecting CF performance. It also opens the door to future studies that apply SNA to CF in order to analyze dataset characteristics. In practice, the study provides guidelines for improving the performance of CF recommender systems with a simple modification.

https://doi.org/10.13088/jiis.2014.20.2.137
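
The steps described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rule that two users are linked when they share at least one item, the normalization of degree by n - 1, the function names, and the toy data are all assumptions made for illustration, and Step 3 (ordinary CF on the remaining users) is left out.

```python
from collections import defaultdict
from itertools import combinations


def project_to_users(ratings):
    """Step 1: convert user -> items (two-mode) into user-user links (one-mode).

    Assumption: two users are linked if they share at least one item.
    """
    users_by_item = defaultdict(set)
    for user, items in ratings.items():
        for item in items:
            users_by_item[item].add(user)
    neighbors = {user: set() for user in ratings}
    for users in users_by_item.values():
        for u, v in combinations(sorted(users), 2):
            neighbors[u].add(v)
            neighbors[v].add(u)
    return neighbors


def degree_centrality(neighbors):
    """Step 2 (first half): number of direct links, divided by n - 1."""
    n = len(neighbors)
    return {u: (len(vs) / (n - 1) if n > 1 else 0.0)
            for u, vs in neighbors.items()}


def split_gray_sheep(ratings, threshold):
    """Step 2 (second half): users below the threshold are treated as gray sheep."""
    centrality = degree_centrality(project_to_users(ratings))
    gray = {u for u, c in centrality.items() if c < threshold}
    return gray, set(ratings) - gray


def popular_items(ratings, k):
    """Step 4: the k items with the most users (ties broken by item name)."""
    counts = defaultdict(int)
    for items in ratings.values():
        for item in items:
            counts[item] += 1
    return sorted(counts, key=lambda i: (-counts[i], i))[:k]


def weighted_f(f_main, n_main, f_gray, n_gray):
    """Combine two F measures, weighted by group size (assumed to be a weighted mean)."""
    return (f_main * n_main + f_gray * n_gray) / (n_main + n_gray)


# Toy data, invented for illustration only.
ratings = {
    "ann": {"m1", "m2", "m3"},
    "bob": {"m1", "m2"},
    "cat": {"m2", "m3"},
    "dan": {"m9"},
}
gray, others = split_gray_sheep(ratings, threshold=0.5)
print(sorted(gray), sorted(others), popular_items(ratings, 2))
```

In the study the threshold is tuned by simulation; here it is simply passed in as a parameter.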

A Method to Decide the Number of Additional Edges and Their Locations to Integrate the Communities by Using Fitness Function (적합도 함수를 이용한 커뮤니티 통합에 필요한 추가에지수 결정 및 위치 선정 방법)

Jun, Byung-Hyun;Lee, Sang-Hoon;Han, Chi-Geun
- Journal of the Korea Society of Computer and Information / v.19 no.12 / pp.239-246 / 2014
In this paper, we propose a method to decide the additional edges needed to integrate two communities $A$ and $B$ ($|A| \geq |B|$, where $|\cdot|$ denotes the size of a set). The proposed algorithm uses a fitness function that expresses the strength of a community; the function is defined in terms of the number of edges inside the community and the number of edges that connect a node in the community to a node outside it. The larger the function value, the stronger the community. The proposed algorithm is a greedy method: when a node of $B$ is merged into $A$, it determines the minimum number of additional edges needed to increase the fitness function value of $A$. After determining the number of additional edges, we define community connectivity measures based on node centrality to determine where the edges are placed. The new edges are placed so as to maximize the connectivity measure of the combined community. The procedure is applied to all nodes in $B$ to integrate $A$ and $B$. The effectiveness of the proposed algorithm is demonstrated on the Zachary Karate Club network.
https://doi.org/10.9708/jksci.2014.19.12.239
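
The abstract above does not give the exact form of the fitness function, so the following Python sketch makes its own assumptions: fitness is taken to be internal edges divided by internal plus boundary edges, "increase" is read as a strict increase over the current fitness of the community, and all function names and the example graph are hypothetical. It only illustrates the greedy step of finding the smallest number of extra edges when one node is merged; it does not choose edge locations.

```python
def edge_counts(edges, community):
    """Count edges fully inside the community and edges crossing its boundary."""
    internal = boundary = 0
    for u, v in edges:
        inside = (u in community) + (v in community)
        if inside == 2:
            internal += 1
        elif inside == 1:
            boundary += 1
    return internal, boundary


def fitness(edges, community):
    """Assumed fitness: internal / (internal + boundary); larger means stronger."""
    internal, boundary = edge_counts(edges, community)
    total = internal + boundary
    return internal / total if total else 0.0


def min_extra_edges(edges, community, node):
    """Smallest number of new node-to-community edges for which the merged
    community's fitness strictly exceeds the current fitness of `community`.

    Returns None if even linking `node` to every member is not enough.
    """
    target = fitness(edges, community)
    merged = set(community) | {node}
    internal, boundary = edge_counts(edges, merged)
    # Members of the community that `node` is already linked to.
    linked = {u if v == node else v for u, v in edges if node in (u, v)}
    free_slots = len(set(community) - linked)
    for extra in range(free_slots + 1):
        total = internal + extra + boundary
        if total and (internal + extra) / total > target:
            return extra
    return None


# Hypothetical undirected graph: community {1, 2, 3}, candidate node 4.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (1, 7), (4, 5), (4, 6)]
print(fitness(edges, {1, 2, 3}), min_extra_edges(edges, {1, 2, 3}, 4))
```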

A Study on the Factors Influencing Semantic Relation in Building a Structured Glossary (구조적 학술용어사전 데이터베이스 구축에 있어서 용어의 의미관계 형성에 영향을 미치는 요인에 관한 연구)

Kwon, Sun-Young
- Journal of the Korean Society for Library and Information Science / v.48 no.2 / pp.353-378 / 2014
The purpose of this study is to identify the factors that affect the formation of semantic relations among terms, and what these factors in turn affect, in order to build the database schema of a terminology dictionary based on structural definitions. In this research, 826,905 keywords from 88,874 social science articles and 985,580 keywords from 125,046 humanities articles published in KCI journals from 2007 to 2011 were collected. From the collected data, we analyzed subject complexity, structural holes, term frequency, occurrence patterns, and the relationship between the number of nodes and the number of patterns derived from the semantic relations of linked terms in the established 'STNet' system. The results from the analyzed data and network patterns are summarized as follows. Betweenness centrality, term frequency, and effective size affect the number of semantic relation nodes. Among these factors, betweenness centrality had the strongest effect, followed by effective size, and term frequency had the weakest. Betweenness centrality, term frequency, and effective size also affect the number of semantic relation types; here term frequency has the strongest effect. Therefore, when building a terminology dictionary, betweenness centrality, term frequency, effective size, and subject complexity should be considered in selecting terms. These factors can be expected to improve the quality of a terminology dictionary.
https://doi.org/10.4275/KSLIS.2014.48.2.353

The Role of Content Services Within a Firm's Internet Service Portfolio: Case Studies of Naver Webtoon and Google YouTube (기업의 인터넷 서비스 포트폴리오 내 콘텐츠 서비스의 역할: 네이버 웹툰과 구글 유튜브의 사례 연구)

Choi, Jiwon;Cho, Wooje;Jung, Yoonhyuk;Kwon, YoungOk
- Journal of Intelligence and Information Systems / v.28 no.1 / pp.1-28 / 2022
In recent years, many Internet giants have begun providing their own content services, which attract online users by offering personalized services based on artificial intelligence technologies. This study investigates the role of two firms' content services within each firm's online service network. We examine the role of Naver Webtoon, which can be characterized as professional-generated content, within Naver's service portfolio, and that of Google YouTube, which can be characterized as user-generated content, within Google's service portfolio. Using survey data on viewers' use of the two services, we analyze a valued directed service network in which a node denotes an online service and a relationship between two nodes denotes the sequential use of two services. We found that both Webtoon and YouTube show higher out-degree centrality than in-degree centrality, which implies that these content services are more likely to be starting services than arriving services within the firms' interactive networks. The gap between the out-degree and in-degree centrality of YouTube is much smaller than that of Webtoon. The high centrality of YouTube, a user-generated content service, within the Google service network shows that YouTube's initial role of providing specific-content videos (e.g., entertainment) has expanded into a general search service for users.
https://doi.org/10.13088/jiis.2022.28.1.001
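
To make the out-degree versus in-degree comparison in the abstract above concrete, here is a small Python sketch of weighted degree in a valued directed network. The edge weights, the service names other than Webtoon, and the function names are invented for illustration; the study's actual survey data and any normalization it applied are not reproduced here.

```python
def weighted_degrees(transitions):
    """Return {service: (out_degree, in_degree)} for a valued directed network.

    `transitions` maps (source, target) to the number of times users moved
    from the source service to the target service.
    """
    out_deg, in_deg = {}, {}
    for (src, dst), weight in transitions.items():
        out_deg[src] = out_deg.get(src, 0) + weight
        in_deg[dst] = in_deg.get(dst, 0) + weight
    nodes = set(out_deg) | set(in_deg)
    return {n: (out_deg.get(n, 0), in_deg.get(n, 0)) for n in nodes}


def out_in_gap(degrees, service):
    """Positive gap: the service is more often a starting point than a destination."""
    out_d, in_d = degrees[service]
    return out_d - in_d


# Hypothetical counts of sequential use between services.
transitions = {
    ("Webtoon", "Search"): 5,
    ("Webtoon", "News"): 3,
    ("Search", "Webtoon"): 2,
    ("News", "Search"): 4,
}
degrees = weighted_degrees(transitions)
print(degrees["Webtoon"], out_in_gap(degrees, "Webtoon"))
```

A service with a large positive gap behaves as a "starting service" in the abstract's terms, while a gap near zero means users arrive at it about as often as they leave it.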

Product Community Analysis Using Opinion Mining and Network Analysis: Movie Performance Prediction Case (오피니언 마이닝과 네트워크 분석을 활용한 상품 커뮤니티 분석: 영화 흥행성과 예측 사례)

Jin, Yu;Kim, Jungsoo;Kim, Jongwoo
- Journal of Intelligence and Information Systems / v.20 no.1 / pp.49-65 / 2014
Word of Mouth (WOM) is a behavior used by consumers to transfer or communicate their product or service experience to other consumers. Due to the popularity of social media such as Facebook, Twitter, blogs, and online communities, electronic WOM (e-WOM) has become important to the success of products or services. As a result, most enterprises pay close attention to e-WOM for their products or services. This is especially important for movies, as these are experiential products. This paper aims to identify the network factors of an online movie community that impact box office revenue using social network analysis. In addition to traditional WOM factors (volume and valence of WOM), network centrality measures of the online community are included as influential factors in box office revenue. Based on previous research results, we develop five hypotheses on the relationships between potential influential factors (WOM volume, WOM valence, degree centrality, betweenness centrality, closeness centrality) and box office revenue. The first hypothesis is that the accumulated volume of WOM in online product communities is positively related to the total revenue of movies. The second hypothesis is that the accumulated valence of WOM in online product communities is positively related to the total revenue of movies. The third hypothesis is that the average of degree centralities of reviewers in online product communities is positively related to the total revenue of movies. The fourth hypothesis is that the average of betweenness centralities of reviewers in online product communities is positively related to the total revenue of movies. The fifth hypothesis is that the average of closeness centralities of reviewers in online product communities is positively related to the total revenue of movies.
To verify our research model, we collect movie review data from the Internet Movie Database (IMDb), which is a representative online movie community, and movie revenue data from the Box-Office-Mojo website. The movies in this analysis include weekly top-10 movies from September 1, 2012, to September 1, 2013, with in total. We collect movie metadata such as screening periods and user ratings; and community data in IMDb including reviewer identification, review content, review times, responder identification, reply content, reply times, and reply relationships. For the same period, the revenue data from Box-Office-Mojo is collected on a weekly basis. Movie community networks are constructed based on reply relationships between reviewers. Using a social network analysis tool, NodeXL, we calculate the averages of three centralities including degree, betweenness, and closeness centrality for each movie. Correlation analysis of focal variables and the dependent variable (final revenue) shows that three centrality measures are highly correlated, prompting us to perform multiple regressions separately with each centrality measure. Consistent with previous research results, our regression analysis results show that the volume and valence of WOM are positively related to the final box office revenue of movies. Moreover, the averages of betweenness centralities from initial community networks impact the final movie revenues. However, both of the averages of degree centralities and closeness centralities do not influence final movie performance. Based on the regression results, three hypotheses, 1, 2, and 4, are accepted, and two hypotheses, 3 and 5, are rejected. This study tries to link the network structure of e-WOM on online product communities with the product's performance. Based on the analysis of a real online movie community, the results show that online community network structures can work as a predictor of movie performance. 
The results show that the betweenness centralities of the reviewer community are critical for the prediction of movie performance. However, degree centralities and closeness centralities do not influence movie performance. As future research topics, similar analyses are required for other product categories such as electronic goods and online content to generalize the study results.
https://doi.org/10.13088/jiis.2014.20.1.049

Social Network Analysis and Its Applications for Authors and Keywords in the JKSS

Kim, Jong-Goen;Choi, Soon-Kuek;Choi, Yong-Seok
- Communications for Statistical Applications and Methods / v.19 no.4 / pp.547-558 / 2012
Social network analysis is a graphical technique for exploring the relationships and characteristics of nodes (people, companies, and organizations) and for identifying important nodes from their positions in a visualized network figure; however, it is difficult to characterize nodes from the figure alone. Their relationships and characteristics can therefore be presented by applying correspondence analysis to an affiliation matrix, which is a type of similarity matrix between nodes. In this study, we present the relationships and characteristics of authors and keywords in the JKSS (Journal of the Korean Statistical Society) using social network analysis and correspondence analysis.
https://doi.org/10.5351/CKSS.2012.19.4.547

The Distinct Impact Dimensions of the Prestige Indices in Author Citation Networks (저자 인용 네트워크에서 명망성 지표의 차별된 영향력 측정기준에 관한 연구)

Ahn, Hyerim;Park, Ji-Hong
- Journal of the Korean Society for Information Management / v.33 no.2 / pp.61-76 / 2016
This study proposes three prestige indices (closeness prestige, input domain, and proximity prestige) as useful measures of the impact of a particular node in citation networks. It compares these prestige indices with other impact indices, since it is still unknown which dimensions of impact they actually measure. The prestige indices make it possible to distinguish the most prominent actors in a directed network, much as centrality indices do in undirected networks. Correlation analysis and principal component analysis were conducted on an author citation network to identify how the three prestige indices differ from existing impact indices. As the existing impact measures for comparison, we selected simple citation counting, the h-index, PageRank, and three kinds of centrality indices that assume undirected networks. The results indicate that the prestige indices capture a distinct dimension of impact from the other indices: the prestige indices reflect indirect impact, while the others reflect direct impact.
https://doi.org/10.3743/KOSIM.2016.33.2.061

Keyword Network Analysis about the Trends of Social Welfare Researches - focused on the papers of KJSW during 1979~2015 - (사회복지학 연구동향에 관한 키워드 네트워크 분석 - ｢한국사회복지학｣ 게재논문(1979-2015)을 중심으로 -)

Kam, Jeong Ki;Kam, Mi Ah;Park, Mi Hee
- Korean Journal of Social Welfare / v.68 no.2 / pp.185-211 / 2016
This study analyzes the keyword networks of papers published in the Korean Journal of Social Welfare, issued by the Korean Academy of Social Welfare, from 1979 to 2015. It aims to investigate trends in social welfare research in Korea by dividing this period into two: 1979-2000 and 2001-2015. It presents the trends in three ways: methodologies, subjects, and intellectual structures. To identify the intellectual structure, it calculates centrality indices based on the co-occurrence frequency of keywords. It also derives values that explain the relationship structure of keywords using the pathfinder algorithm, and finally visualizes the intellectual structures using the NodeXL program. Some implications of the findings are discussed at the end.

Exploration of Emotional Labor Research Trends in Korea through Keyword Network Analysis (주제어 네트워크 분석(network analysis)을 통한 국내 감정노동의 연구동향 탐색)

Lee, Namyeon;Kim, Joon-Hwan;Mun, Hyung-Jin
- Journal of Convergence for Information Technology / v.9 no.3 / pp.68-74 / 2019
The purpose of this study was to identify research trends in 892 domestic articles (2009-2018) related to emotional labor using text mining and network analysis. To this end, the keywords of these papers were collected, coded, and converted into 871 nodes and 2,625 links for network text analysis. First, the network text analysis revealed that the top four keywords by co-occurrence frequency were, in order, burnout, turnover intention, job stress, and job satisfaction, and that both the frequency and the degree centrality of these top four core keywords were relatively high. Second, an ego network analysis was conducted on the top four core keywords by degree centrality, and the keywords at the connection center of each network were presented.
https://doi.org/10.22156/CS4SMB.2019.9.3.068

Development of Modeling to Find the Hub Nodes on Growing Scale-free Network based on Stochastic Community Bridge Node Finder (확장하는 Scale-free 네트워크에서의 허브노드 도출을 위한 Stochastic Community Bridge Node Finder 개발)

