• Title/Summary/Keyword: Web-Based Training

Search Result 243, Processing Time 0.018 seconds

Predicting the Performance of Recommender Systems through Social Network Analysis and Artificial Neural Network (사회연결망분석과 인공신경망을 이용한 추천시스템 성능 예측)

  • Cho, Yoon-Ho;Kim, In-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.159-172
    • /
    • 2010
  • The recommender system is one of the possible solutions to assist customers in finding the items they would like to purchase. To date, a variety of recommendation techniques have been developed. One of the most successful recommendation techniques is Collaborative Filtering (CF) that has been used in a number of different applications such as recommending Web pages, movies, music, articles and products. CF identifies customers whose tastes are similar to those of a given customer, and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems. Broadly, there are memory-based CF algorithms, model-based CF algorithms, and hybrid CF algorithms which combine CF with content-based techniques or other recommender systems. While many researchers have focused their efforts in improving CF performance, the theoretical justification of CF algorithms is lacking. That is, we do not know many things about how CF is done. Furthermore, the relative performances of CF algorithms are known to be domain and data dependent. It is very time-consuming and expensive to implement and launce a CF recommender system, and also the system unsuited for the given domain provides customers with poor quality recommendations that make them easily annoyed. Therefore, predicting the performances of CF algorithms in advance is practically important and needed. In this study, we propose an efficient approach to predict the performance of CF. Social Network Analysis (SNA) and Artificial Neural Network (ANN) are applied to develop our prediction model. CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. SNA facilitates an exploration of the topological properties of the network structure that are implicit in data for CF recommendations. An ANN model is developed through an analysis of network topology, such as network density, inclusiveness, clustering coefficient, network centralization, and Krackhardt's efficiency. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Inclusiveness refers to the number of nodes which are included within the various connected parts of the social network. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. Krackhardt's efficiency characterizes how dense the social network is beyond that barely needed to keep the social group even indirectly connected to one another. We use these social network measures as input variables of the ANN model. As an output variable, we use the recommendation accuracy measured by F1-measure. In order to evaluate the effectiveness of the ANN model, sales transaction data from H department store, one of the well-known department stores in Korea, was used. Total 396 experimental samples were gathered, and we used 40%, 40%, and 20% of them, for training, test, and validation, respectively. The 5-fold cross validation was also conducted to enhance the reliability of our experiments. The input variable measuring process consists of following three steps; analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used Net Miner 3 and UCINET 6.0 for SNA, and Clementine 11.1 for ANN modeling. The experiments reported that the ANN model has 92.61% estimated accuracy and 0.0049 RMSE. Thus, we can know that our prediction model helps decide whether CF is useful for a given application with certain data characteristics.

A Hybrid Recommender System based on Collaborative Filtering with Selective Use of Overall and Multicriteria Ratings (종합 평점과 다기준 평점을 선택적으로 활용하는 협업필터링 기반 하이브리드 추천 시스템)

  • Ku, Min Jung;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.85-109
    • /
    • 2018
  • Recommender system recommends the items expected to be purchased by a customer in the future according to his or her previous purchase behaviors. It has been served as a tool for realizing one-to-one personalization for an e-commerce service company. Traditional recommender systems, especially the recommender systems based on collaborative filtering (CF), which is the most popular recommendation algorithm in both academy and industry, are designed to generate the items list for recommendation by using 'overall rating' - a single criterion. However, it has critical limitations in understanding the customers' preferences in detail. Recently, to mitigate these limitations, some leading e-commerce companies have begun to get feedback from their customers in a form of 'multicritera ratings'. Multicriteria ratings enable the companies to understand their customers' preferences from the multidimensional viewpoints. Moreover, it is easy to handle and analyze the multidimensional ratings because they are quantitative. But, the recommendation using multicritera ratings also has limitation that it may omit detail information on a user's preference because it only considers three-to-five predetermined criteria in most cases. Under this background, this study proposes a novel hybrid recommendation system, which selectively uses the results from 'traditional CF' and 'CF using multicriteria ratings'. Our proposed system is based on the premise that some people have holistic preference scheme, whereas others have composite preference scheme. Thus, our system is designed to use traditional CF using overall rating for the users with holistic preference, and to use CF using multicriteria ratings for the users with composite preference. To validate the usefulness of the proposed system, we applied it to a real-world dataset regarding the recommendation for POI (point-of-interests). Providing personalized POI recommendation is getting more attentions as the popularity of the location-based services such as Yelp and Foursquare increases. The dataset was collected from university students via a Web-based online survey system. Using the survey system, we collected the overall ratings as well as the ratings for each criterion for 48 POIs that are located near K university in Seoul, South Korea. The criteria include 'food or taste', 'price' and 'service or mood'. As a result, we obtain 2,878 valid ratings from 112 users. Among 48 items, 38 items (80%) are used as training dataset, and the remaining 10 items (20%) are used as validation dataset. To examine the effectiveness of the proposed system (i.e. hybrid selective model), we compared its performance to the performances of two comparison models - the traditional CF and the CF with multicriteria ratings. The performances of recommender systems were evaluated by using two metrics - average MAE(mean absolute error) and precision-in-top-N. Precision-in-top-N represents the percentage of truly high overall ratings among those that the model predicted would be the N most relevant items for each user. The experimental system was developed using Microsoft Visual Basic for Applications (VBA). The experimental results showed that our proposed system (avg. MAE = 0.584) outperformed traditional CF (avg. MAE = 0.591) as well as multicriteria CF (avg. AVE = 0.608). We also found that multicriteria CF showed worse performance compared to traditional CF in our data set, which is contradictory to the results in the most previous studies. This result supports the premise of our study that people have two different types of preference schemes - holistic and composite. Besides MAE, the proposed system outperformed all the comparison models in precision-in-top-3, precision-in-top-5, and precision-in-top-7. The results from the paired samples t-test presented that our proposed system outperformed traditional CF with 10% statistical significance level, and multicriteria CF with 1% statistical significance level from the perspective of average MAE. The proposed system sheds light on how to understand and utilize user's preference schemes in recommender systems domain.

KNU Korean Sentiment Lexicon: Bi-LSTM-based Method for Building a Korean Sentiment Lexicon (Bi-LSTM 기반의 한국어 감성사전 구축 방안)

  • Park, Sang-Min;Na, Chul-Won;Choi, Min-Seong;Lee, Da-Hee;On, Byung-Won
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.219-240
    • /
    • 2018
  • Sentiment analysis, which is one of the text mining techniques, is a method for extracting subjective content embedded in text documents. Recently, the sentiment analysis methods have been widely used in many fields. As good examples, data-driven surveys are based on analyzing the subjectivity of text data posted by users and market researches are conducted by analyzing users' review posts to quantify users' reputation on a target product. The basic method of sentiment analysis is to use sentiment dictionary (or lexicon), a list of sentiment vocabularies with positive, neutral, or negative semantics. In general, the meaning of many sentiment words is likely to be different across domains. For example, a sentiment word, 'sad' indicates negative meaning in many fields but a movie. In order to perform accurate sentiment analysis, we need to build the sentiment dictionary for a given domain. However, such a method of building the sentiment lexicon is time-consuming and various sentiment vocabularies are not included without the use of general-purpose sentiment lexicon. In order to address this problem, several studies have been carried out to construct the sentiment lexicon suitable for a specific domain based on 'OPEN HANGUL' and 'SentiWordNet', which are general-purpose sentiment lexicons. However, OPEN HANGUL is no longer being serviced and SentiWordNet does not work well because of language difference in the process of converting Korean word into English word. There are restrictions on the use of such general-purpose sentiment lexicons as seed data for building the sentiment lexicon for a specific domain. In this article, we construct 'KNU Korean Sentiment Lexicon (KNU-KSL)', a new general-purpose Korean sentiment dictionary that is more advanced than existing general-purpose lexicons. The proposed dictionary, which is a list of domain-independent sentiment words such as 'thank you', 'worthy', and 'impressed', is built to quickly construct the sentiment dictionary for a target domain. Especially, it constructs sentiment vocabularies by analyzing the glosses contained in Standard Korean Language Dictionary (SKLD) by the following procedures: First, we propose a sentiment classification model based on Bidirectional Long Short-Term Memory (Bi-LSTM). Second, the proposed deep learning model automatically classifies each of glosses to either positive or negative meaning. Third, positive words and phrases are extracted from the glosses classified as positive meaning, while negative words and phrases are extracted from the glosses classified as negative meaning. Our experimental results show that the average accuracy of the proposed sentiment classification model is up to 89.45%. In addition, the sentiment dictionary is more extended using various external sources including SentiWordNet, SenticNet, Emotional Verbs, and Sentiment Lexicon 0603. Furthermore, we add sentiment information about frequently used coined words and emoticons that are used mainly on the Web. The KNU-KSL contains a total of 14,843 sentiment vocabularies, each of which is one of 1-grams, 2-grams, phrases, and sentence patterns. Unlike existing sentiment dictionaries, it is composed of words that are not affected by particular domains. The recent trend on sentiment analysis is to use deep learning technique without sentiment dictionaries. The importance of developing sentiment dictionaries is declined gradually. However, one of recent studies shows that the words in the sentiment dictionary can be used as features of deep learning models, resulting in the sentiment analysis performed with higher accuracy (Teng, Z., 2016). This result indicates that the sentiment dictionary is used not only for sentiment analysis but also as features of deep learning models for improving accuracy. The proposed dictionary can be used as a basic data for constructing the sentiment lexicon of a particular domain and as features of deep learning models. It is also useful to automatically and quickly build large training sets for deep learning models.