• Title/Summary/Keyword: Online social recommendation


User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

  • Kim, Jieun;Kim, Namgyu;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.93-107 / 2014
  • In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of data collection by companies about customer needs. Most companies have failed to uncover such needs for products or services properly in terms of demographic data such as age, income levels, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with appropriate business information for current business circumstances. However, part of the problem is the increasing regulation of personal data gathering and privacy. This makes demographic or transaction data collection more difficult, and is a significant hurdle for traditional recommendation approaches because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper to academia is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the proposed approach in this paper attempts to construct double two-mode networks, such as a user-news network and news-issue network, and to integrate these into one quasi-network as the input for issue clustering. One of the contributions of this research is the development of a methodology utilizing enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. In order to build multi-layered two-mode networks of news logs, we need some tools such as text mining and topic analysis. We used not only SAS Enterprise Miner 12.1, which provides a text miner module and cluster module for textual data analysis, but also NetMiner 4 for network visualization and analysis. 
Our approach for user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites with a crawler. After gathering unstructured news article data, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed with access patterns derived from web transaction logs. The double two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence, and compare these with the clustering results from statistical tools and network analysis. An experiment with a large dataset was performed to build a multi-layer two-mode network. After that, we compared the results of issue clustering from SAS with those of network analysis. The experimental dataset was from a web site ranking site, and the biggest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles of 5,000 panel members over one year. User-article and article-issue networks are constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results applied the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), and are consistent with the results from SAS clustering. In spite of extensive efforts to provide user information with recommendation systems, most projects are successful only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support to decision-making in companies because it enhances user-related data from unstructured textual data. 
To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
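
The network-merging phase described in this abstract can be illustrated with a minimal sketch. It assumes that each two-mode network is stored as a 0/1 affiliation matrix, that merging is done by matrix multiplication, and that structural equivalence is approximated by the Euclidean distance between issue columns. The toy data and the names `merge_two_mode` and `issue_distance` are hypothetical and are not taken from the paper, which used SAS Enterprise Miner and NetMiner.

```python
# Hypothetical toy data: rows are users, columns are articles (1 = user visited the article).
user_article = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
]
# Rows are articles, columns are issues (1 = the article covers the issue).
article_issue = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]


def merge_two_mode(left, right):
    """Multiply two affiliation matrices so that the outer node sets become linked."""
    inner = len(right)
    cols = len(right[0])
    return [
        [sum(row[k] * right[k][j] for k in range(inner)) for j in range(cols)]
        for row in left
    ]


def issue_distance(user_issue, a, b):
    """Euclidean distance between two issue columns (assumed structural-equivalence proxy)."""
    return sum((row[a] - row[b]) ** 2 for row in user_issue) ** 0.5


# Each cell counts how many visited articles connect a user to an issue.
user_issue = merge_two_mode(user_article, article_issue)
print(user_issue)
```

Issues whose columns lie close together are accessed by similar users, so a distance matrix built with `issue_distance` could serve as input to a clustering step such as PAM.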

The Research on Recommender for New Customers Using Collaborative Filtering and Social Network Analysis (협력필터링과 사회연결망을 이용한 신규고객 추천방법에 대한 연구)

  • Shin, Chang-Hoon;Lee, Ji-Won;Yang, Han-Na;Choi, Il Young
    • Journal of Intelligence and Information Systems / v.18 no.4 / pp.19-42 / 2012
  • Consumption patterns are shifting rapidly as buyers migrate from offline markets to e-commerce channels, such as TV shopping channels and Internet shopping malls. In offline markets, consumers go shopping, see the items, and choose from them. Recently, consumers have tended to buy from shopping sites, free from constraints of time and place. However, as e-commerce markets continue to expand, customers are complaining that it is becoming a bigger hassle to shop online. In online shopping, shoppers have very limited information on the products. The delivered products can be different from what they wanted, which results in purchase cancellations. Because this happens frequently, shoppers are likely to refer to consumer reviews, and companies should pay attention to consumers' voices. E-commerce is a very important marketing tool for suppliers. It can recommend products to customers and connect them directly with suppliers with just a click of a button. Recommender systems are being studied in various ways. Some of the more prominent approaches include recommendation based on best-sellers and demographics, content-based filtering, and collaborative filtering. However, these systems all share two weaknesses: they cannot recommend products to consumers on a personal level, and they cannot recommend products to new consumers with no buying history. To fix these problems, we can use information collected from questionnaires about consumers' demographics and preference ratings. However, consumers feel these questionnaires are a burden and are unlikely to provide correct information. This study investigates combining collaborative filtering with the centrality measures of social network analysis. Centrality provides the information needed to infer the preferences of new consumers from the shopping history of existing and previous ones. 
While past research focused on existing consumers with similar shopping patterns, this study tried to improve the accuracy of recommendation by using all shopping information, which included not only similar shopping patterns but also dissimilar ones. The data used in this study, the MovieLens data, was compiled by the GroupLens Research Project team at the University of Minnesota to recommend movies with a collaborative filtering technique. This data was built from the questionnaires of 943 respondents, who gave preference ratings on 1,684 movies. The total of 100,000 ratings was ordered by time, with the first 50,000 treated as existing customers and the latter 50,000 as new customers. The proposed recommender system consists of three systems: the [+] group recommender system, the [-] group recommender system, and the integrated recommender system. The [+] group recommender system treats customers with similar buying patterns as 'neighbors', whereas the [-] group recommender system treats customers with opposite buying patterns as 'contraries'. The integrated recommender system uses both of the aforementioned systems to recommend movies that both of them pick. Studying the three systems allows us to find the most suitable recommender system for optimizing accuracy and customer satisfaction. Our analysis showed that the integrated recommender system is the best solution among the three systems studied, followed by the [-] group recommender system and the [+] group recommender system. This result conforms to the intuition that the accuracy of recommendation can be improved by using all the relevant information. We provided contour maps and graphs to make it easy to compare the accuracy of each recommender system. Although we saw improved accuracy with the integrated recommender system, we must remember that this research is based on static data with no live customers. In other words, consumers did not actually see the movies recommended by the system. 
Also, this recommendation system may not work well with products other than movies. Thus, it is important to note that recommendation systems need particular calibration for specific product/customer types.
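
The distinction between 'neighbors' and 'contraries' in this abstract can be sketched as follows. The abstract does not state which similarity measure the authors used, so this sketch assumes Pearson correlation over co-rated movies: a positive correlation marks a neighbor and a negative one marks a contrary. The ratings and the function names `pearson` and `split_groups` are hypothetical, and the centrality-based inference for new customers is not reproduced here.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "ann": {"m1": 5, "m2": 4, "m3": 1},
    "bob": {"m1": 4, "m2": 5, "m3": 2},
    "cat": {"m1": 1, "m2": 2, "m3": 5},
}


def pearson(a, b):
    """Pearson correlation over the movies that both users rated."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    xs = [a[m] for m in common]
    ys = [b[m] for m in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def split_groups(ratings, target):
    """Return ([+] neighbors, [-] contraries) of the target user."""
    neighbors, contraries = [], []
    for other in sorted(ratings):
        if other == target:
            continue
        r = pearson(ratings[target], ratings[other])
        if r > 0:
            neighbors.append(other)
        elif r < 0:
            contraries.append(other)
    return neighbors, contraries


neighbors, contraries = split_groups(ratings, "ann")
print(neighbors, contraries)
```

An integrated recommender in the spirit of the abstract would then keep only the movies suggested by both groups, for example those that neighbors rated highly and contraries rated poorly.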

Internet Shopping in Japan: Shopping motivation, Perceived Risks, and Innovativeness (일본의 인터넷 쇼핑 실태에 관한 연구: 쇼핑동기, 지각위험, 혁신성을 중심으로)

  • Park, Cheol;Kang, You Rie
    • Asia-Pacific Journal of Business / v.2 no.1 / pp.91-114 / 2011
  • The market size of e-commerce in Japan was 15 trillion yen in 2006, and B2C Internet shopping sales were over 6.57 trillion yen in 2009. Rakuten is a representative Internet shopping company with a market share of 45%. Rakuten has over 70,000 online stores, and Japanese shoppers trust them because of the fair competition rule and the pre-control system for e-commerce. Japanese consumers accept new technology rapidly and make heavy use of Internet and mobile channels. This research analyzes online shopping behavior in Japan, a large e-commerce market. Internet shopping intention, satisfaction, and recommendation were analyzed by Internet shopping motivation, perceived risk, and shopping innovativeness. A questionnaire survey of 464 Japanese consumers was performed, and ANOVA, factor analysis, and reliability tests were conducted with SPSS 12.0. The results show that Internet shopping intentions were higher among older respondents and among those with higher innovativeness. Housewives' satisfaction with Internet shopping was the highest. The high-innovativeness group showed higher economic, convenience, hedonic, and social motivations for Internet shopping. Students, women, and the low-income group perceived higher risks in Internet shopping. Implications and directions for further research are suggested based on the results.


Comparisons of Popularity- and Expert-Based News Recommendations: Similarities and Importance (인기도 기반의 온라인 추천 뉴스 기사와 전문 편집인 기반의 지면 뉴스 기사의 유사성과 중요도 비교)

  • Suh, Kil-Soo;Lee, Seongwon;Suh, Eung-Kyo;Kang, Hyebin;Lee, Seungwon;Lee, Un-Kon
    • Asia Pacific Journal of Information Systems / v.24 no.2 / pp.191-210 / 2014
  • As mobile devices that can be connected to the Internet have spread and networking has become possible anytime and anywhere, the Internet has become central in the dissemination and consumption of news. Accordingly, the ways news is gathered, disseminated, and consumed have changed greatly. In the traditional news media such as magazines and newspapers, expert editors determined what events were worthy of deploying their staffs or freelancers to cover and what stories from newswires or other sources would be printed. Furthermore, they determined how these stories would be displayed in their publications in terms of page placement, space allocation, type sizes, photographs, and other graphic elements. In turn, readers, as news consumers, judged the importance of news not only by its subject and content, but also through subsidiary information such as its location and how it was displayed. Their judgments reflected their acceptance of an assumption that these expert editors had the knowledge and ability not only to serve as gatekeepers in determining what news was valuable and important but also to rank its value and importance. As such, news assembled, dispensed, and consumed in this manner can be said to be expert-based recommended news. However, in the era of Internet news, the role of expert editors as gatekeepers has been greatly diminished. Many Internet news sites offer a huge volume of news on diverse topics from many media companies, thereby eliminating in many cases the gatekeeper role of expert editors. One result has been to turn news users from passive recipients into active users who search for news that reflects their interests or tastes. To solve the problem of information overload and enhance the efficiency of news users' searches, Internet news sites have introduced numerous recommendation techniques. Recommendations based on popularity constitute one of the most frequently used of these techniques. 
This popularity-based approach shows a list of those news items that have been read and shared by many people, based on users' behavior such as clicks, evaluations, and sharing. The "most-viewed list," "most-replied list," and "real-time issue" features found on news sites belong to this system. Given that collective intelligence serves as the premise of these popularity-based recommendations, popularity-based news recommendations would be considered highly important because stories that have been read and shared by many people are presumably more likely to be better than those preferred by only a few people. However, these recommendations may reflect a popularity bias because stories judged likely to be more popular have been placed where they will be most noticeable. As a result, such stories are more likely to be continuously exposed and included in popularity-based recommended news lists. Popular news stories are not necessarily those that are most important to readers. Given that many people use popularity-based recommended news and that the popularity-based recommendation approach greatly affects patterns of news use, it is essential to review whether popularity-based news recommendations actually reflect important news. Therefore, in this study, the popularity-based news recommendations of an Internet news portal were compared with the top placements of news in printed newspapers, and news users' judgments of which stories were personally and socially important were analyzed. The study was conducted in two stages. In the first stage, content analyses were used to compare the content of the popularity-based news recommendations of an Internet news site with those of the expert-based news recommendations of printed newspapers. Five days of news stories were collected. 
The "most-viewed list" of the Naver portal site was used as the popularity-based recommendations; the expert-based recommendations were represented by the top pieces of news from five major daily newspapers: the Chosun Ilbo, the JoongAng Ilbo, the Dong-A Daily News, the Hankyoreh Shinmun, and the Kyunghyang Shinmun. In the second stage, along with the news stories collected in the first stage, some Internet news stories and some news stories from printed newspapers that the Internet and the newspapers did not have in common were randomly extracted and used in online questionnaire surveys that asked respondents to rate the importance of these selected news stories. According to our analysis, only 10.81% of the popularity-based news recommendations were similar in content to the expert-based news judgments. Therefore, the content of popularity-based news recommendations appears to be quite different from the content of expert-based recommendations. The differences in importance between these two groups of news stories were analyzed, and the results indicated that whereas the two groups did not differ significantly in their recommendations of stories of personal importance, the expert-based recommendations ranked higher in social importance. This study has importance for theory in its examination of popularity-based news recommendations from the two theoretical viewpoints of collective intelligence and popularity bias and by its use of both qualitative (content analysis) and quantitative (questionnaire) methods. It also sheds light on the differences in role between media channels that fulfill an agenda-setting function and Internet news sites that treat news from the viewpoint of markets.

Resolving the 'Gray sheep' Problem Using Social Network Analysis (SNA) in Collaborative Filtering (CF) Recommender Systems (소셜 네트워크 분석 기법을 활용한 협업필터링의 특이취향 사용자(Gray Sheep) 문제 해결)

  • Kim, Minsung;Im, Il
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.137-148 / 2014
  • Recommender systems have become one of the most important technologies in e-commerce. The ultimate reason to shop online, for many consumers, is to reduce the effort required for information search and purchase. Recommender systems are a key technology for serving these needs. Many past studies of recommender systems have been devoted to developing and improving recommendation algorithms, and collaborative filtering (CF) is known to be the most successful one. Despite its success, however, CF has several shortcomings, such as the cold-start, sparsity, and gray sheep problems. In order to generate recommendations, ordinary CF algorithms require evaluations or preference information directly from users. For new users who do not have any evaluations or preference information, therefore, CF cannot come up with recommendations (cold-start problem). As the numbers of products and customers increase, the scale of the data increases exponentially and most of the data cells are empty. This sparse dataset makes computation for recommendation extremely hard (sparsity problem). Since CF is based on the assumption that there are groups of users sharing common preferences or tastes, CF becomes inaccurate if there are many users with rare and unique tastes (gray sheep problem). This study proposes a new algorithm that utilizes Social Network Analysis (SNA) techniques to resolve the gray sheep problem. We utilize 'degree centrality' in SNA to identify users with unique preferences (gray sheep). Degree centrality in SNA refers to the number of direct links to and from a node. In a network of users who are connected through common preferences or tastes, those with unique tastes have fewer links to other users (nodes) and are isolated from other users. Therefore, gray sheep can be identified by calculating the degree centrality of each node. We divide the dataset into two, gray sheep and others, based on the degree centrality of the users. 
Then, different similarity measures and recommendation methods are applied to these two datasets. The detailed algorithm is as follows: Step 1: Convert the initial data, which is a two-mode network (user to item), into a one-mode network (user to user). Step 2: Calculate the degree centrality of each node and separate those nodes having degree centrality values lower than the pre-set threshold. The threshold value is determined by simulations such that the accuracy of CF for the remaining dataset is maximized. Step 3: An ordinary CF algorithm is applied to the remaining dataset. Step 4: Since the separated dataset consists of users with unique tastes, an ordinary CF algorithm cannot generate recommendations for them. A 'popular item' method is used to generate recommendations for these users. The F measures of the two datasets are weighted by the numbers of nodes and summed to be used as the final performance metric. In order to test the performance improvement from this new algorithm, an empirical study was conducted using a publicly available dataset, the MovieLens data from the GroupLens research team. We used 100,000 evaluations by 943 users on 1,682 movies. The proposed algorithm was compared with an ordinary CF algorithm utilizing the 'Best-N-neighbors' and 'Cosine' similarity methods. The empirical results show that the F measure improved by about 11% on average when the proposed algorithm was used. Past studies to improve CF performance typically used additional information other than users' evaluations, such as demographic data. Some studies applied SNA techniques as a new similarity metric. This study is novel in that it used SNA to separate the dataset. This study shows that the performance of CF can be improved, without any additional information, when SNA techniques are used as proposed. This study has several theoretical and practical implications. This study empirically shows that the characteristics of a dataset can affect the performance of CF recommender systems. This helps researchers understand the factors affecting the performance of CF. This study also opens a door for future studies that apply SNA to CF to analyze the characteristics of datasets. In practice, this study provides guidelines for improving the performance of CF recommender systems with a simple modification.
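
Steps 1, 2, and 4 of the algorithm in this abstract can be sketched in a few lines. The sketch assumes that two users are linked when they share at least one liked item and that the threshold is given rather than tuned by simulation. The toy data and the names `degree_centrality`, `split_gray_sheep`, and `popular_items` are hypothetical. Step 3, the ordinary CF algorithm, is omitted.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: user -> set of liked items (a two-mode user-item network).
likes = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b"},
    "u3": {"b", "c"},
    "u4": {"z"},  # shares no item with anyone else
}


def degree_centrality(likes, min_shared=1):
    """Steps 1-2: project to a user-user network and count each user's links."""
    degree = {user: 0 for user in likes}
    for u, v in combinations(likes, 2):
        if len(likes[u] & likes[v]) >= min_shared:
            degree[u] += 1
            degree[v] += 1
    return degree


def split_gray_sheep(degree, threshold):
    """Step 2: users whose degree falls below the threshold are treated as gray sheep."""
    gray = sorted(u for u, d in degree.items() if d < threshold)
    rest = sorted(u for u, d in degree.items() if d >= threshold)
    return gray, rest


def popular_items(likes, exclude, n=2):
    """Step 4: recommend the most-liked items the user has not already chosen."""
    counts = Counter(item for items in likes.values() for item in items)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked if item not in exclude][:n]


degree = degree_centrality(likes)
gray, rest = split_gray_sheep(degree, threshold=1)
recommendations = {u: popular_items(likes, likes[u]) for u in gray}
print(gray, rest, recommendations)
```

In the study itself the threshold is chosen by simulation so that CF accuracy on the remaining users is maximized; here it is fixed at 1 only to keep the example small.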

Extension Method of Association Rules Using Social Network Analysis (사회연결망 분석을 활용한 연관규칙 확장기법)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.111-126 / 2017
  • Recommender systems based on association rule mining contribute significantly to sellers' sales by reducing the time consumers spend searching for the products they want. Recommendations based on the frequency of transactions such as orders can effectively identify the products that are statistically marketable among multiple products. A product with high sales potential, however, can be omitted from the recommendations if it records too few transactions at the beginning of its sale. Products missing from the associated recommendations may lose the chance of exposure to consumers, which leads to a decline in the number of transactions. In turn, diminished transactions may create a vicious circle of lost opportunities to be recommended. Thus, initial sales are likely to remain stagnant for a certain period of time. Products that are susceptible to fashion or seasonality, such as clothing, may be greatly affected. This study aimed to expand association rules so that the list of recommendations includes products whose initial transaction frequency is low despite their high sales potential. The particular purpose is to predict the strength of the direct connection between two unconnected items through the properties of the paths located between them. An association between two items revealed in transactions can be interpreted as an interaction between them, which can be expressed as a link in a social network whose nodes are items. The first step calculates the centralities of the nodes in the middle of the paths that indirectly connect two nodes that have no direct connection. The next step identifies the number of such paths and the shortest among them. These extracted features are used as independent variables in the regression analysis to predict the future connection strength between the nodes. 
The strength of the connection between the two nodes of the model, which is defined by the number of nodes between the two nodes, is measured after a certain period of time. The regression analysis results confirm that the number of paths between the two products, the distance of the shortest path, and the number of neighboring items connected to the products are significantly related to their potential strength. This study used actual order transaction data collected over three months, from February to April 2016, from an online commerce company. To reduce the complexity of the analysis as the scale of the network grows, the analysis was performed only on miscellaneous goods. Two consecutively purchased items were chosen from each customer's transactions to obtain an antecedent-consequent pair, which provides a link for constructing the social network. The direction of the link was determined by the order in which the goods were purchased. The social network of associated items was built from the data excluding the last ten days of the collection period, and the independent variables were extracted from it. The model predicts the number of links to be connected in the next ten days from the explanatory variables. Of the 5,711 previously unconnected links, 611 were newly connected during the last ten days. In experiments, the proposed model demonstrated excellent predictions. Of the 571 links that the proposed model predicted, 269 were confirmed to have been connected. This is 4.4 times the average of 61 that can be found without any prediction model (about 571 × 611 / 5,711 if the same number of links were picked at random). This study is expected to be useful for industries in which new products launch quickly with short life cycles, since their exposure time is critical. Also, it can be used to detect diseases that are rarely found in the early stages of medical treatment because of their low incidence. 
Since the complexity of the social networking analysis is sensitive to the number of nodes and links that make up the network, this study was conducted in a particular category of miscellaneous goods. Future research should consider that this condition may limit the opportunity to detect unexpected associations between products belonging to different categories of classification.
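
Two of the path properties named in this abstract, the number of paths between two unconnected items and the length of the shortest one, can be computed as in the following minimal sketch. It assumes a small directed item network stored as an adjacency dictionary and caps path length at three links. The item names, the cap, and the function names `shortest_path_length` and `count_simple_paths` are hypothetical. The centrality features and the regression step are not reproduced.

```python
from collections import deque

# Hypothetical directed item network: an edge a -> b means b was bought right after a.
links = {
    "bag": ["wallet", "belt"],
    "wallet": ["scarf"],
    "belt": ["scarf"],
    "scarf": [],
}


def shortest_path_length(links, source, target):
    """Breadth-first search; returns None when the target cannot be reached."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def count_simple_paths(links, source, target, max_len=3):
    """Count paths without repeated nodes that use at most max_len links."""

    def walk(node, visited, depth):
        if node == target:
            return 1
        if depth == max_len:
            return 0
        return sum(
            walk(nxt, visited | {nxt}, depth + 1)
            for nxt in links.get(node, [])
            if nxt not in visited
        )

    return walk(source, {source}, 0)


# "bag" and "scarf" have no direct link, but two indirect paths connect them.
features = {
    "n_paths": count_simple_paths(links, "bag", "scarf"),
    "shortest": shortest_path_length(links, "bag", "scarf"),
}
print(features)
```

Feature dictionaries of this kind, one per unconnected item pair, could then be fed to a regression model as explanatory variables.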

A Study on Detecting Fake Reviews Using Machine Learning: Focusing on User Behavior Analysis (머신러닝을 활용한 가짜리뷰 탐지 연구: 사용자 행동 분석을 중심으로)

  • Lee, Min Cheol;Yoon, Hyun Shik
    • Knowledge Management Research / v.21 no.3 / pp.177-195 / 2020
  • Growing public awareness of fake reviews has prompted researchers to propose countermeasures, either by analyzing the content of fake reviews or by detecting them through their structural characteristics. This research collected data from Naver blog posts and used a machine learning model, with variables extracted from blogs and blog posts, to detect the habitual patterns that writers follow unconsciously, with the aim of using the technique to predict fake reviews. Data analysis showed a very strong relationship between the total number of posts registered on the writer's blog and the date when the post was registered. Random Forest was found to be the most suitable model for detecting advertising reviews. If a review is predicted to be an advertising one by the model suggested in this research, it is very likely a fake review that violates the guidelines on investigation into markings and advertising regarding recommendation and guarantee in the Law of Marking and Advertising. Instead of analyzing the morphemes in the text, this research analyzes the writer's behavior, collects the characteristic data of blogs and blog posts with an automated system rather than manually, and discerns whether a given post is advertising. This approach is expected to improve the efficiency and effectiveness of fake review detection.

A Comparison Study of RNN, CNN, and GAN Models in Sequential Recommendation (순차적 추천에서의 RNN, CNN 및 GAN 모델 비교 연구)

  • Yoon, Ji Hyung;Chung, Jaewon;Jang, Beakcheol
    • Journal of Internet Computing and Services / v.23 no.4 / pp.21-33 / 2022
  • Recommender systems have recently come into wide use in fields such as movies, music, online shopping, and social media. Recommender models have evolved from correlation analysis with the Apriori model, which can be regarded as the first-generation model in the field. Since 2005, many models have been proposed, including deep learning-based models, which are receiving a great deal of attention. Recommender models can be classified into collaborative filtering methods, content-based methods, and hybrid methods that integrate the two. However, these basic methods are gradually losing their standing in the field as they fail to adapt to changing internal and external factors such as rapidly changing user-item interactions and the development of big data. Meanwhile, deep learning methodologies are becoming more important in recommender systems because of advantages such as nonlinear transformation, representation learning, sequence modeling, and flexibility. In this paper, among deep learning methodologies, RNN-, CNN-, and GAN-based models suitable for sequential modeling, which can analyze user-item interactions accurately and flexibly, are classified, compared, and analyzed.

