• Title/Summary/Keyword: empirical Bayes


Movie Popularity Classification Based on Support Vector Machine Combined with Social Network Analysis

  • Dorjmaa, Tserendulam;Shin, Taeksoo
    • Journal of Information Technology Services / v.16 no.3 / pp.167-183 / 2017
  • The rapid growth of information technology and mobile service platforms, such as the internet, Google, and Facebook, has led to an abundance of data. In this environment, the world is facing a revolution in how data are searched, collected, stored, and shared. This abundance creates opportunities for knowledge discovery and data mining techniques. In recent years, data mining methods for discovering and extracting knowledge from databases have become more popular in e-commerce service fields, in particular movie recommendation. However, most classification approaches for predicting movie popularity have used only a few types of movie information, such as actor, director, rating score, language, and country. In this study, we propose a classification model based on a support vector machine (SVM) for predicting movie popularity from movies' genre data and social network data. Social network analysis (SNA) is used to improve classification accuracy. This study builds a movie network (a one-mode network) from initial data that form a two-mode user-to-movie network. For the proposed method, we computed degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality as centrality measures in the movie network. These four centrality values and the movies' genre data were used to classify movie popularity. Logistic regression, a neural network, a naïve Bayes classifier, and a decision tree were used as benchmark models for comparison with the proposed model. To assess classification accuracy, this study used MovieLens data as an open database. Our empirical results indicate that the proposed model with movies' genre and centrality data has approximately 0% higher accuracy than other classification models with only movies' genre data. The results imply that the proposed model can be used to improve movie popularity classification accuracy.
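
As a rough illustration of the network-building step described in this abstract, the sketch below projects a two-mode user-to-movie network onto a one-mode movie network and computes degree centrality, one of the four measures mentioned. The ratings data, the function names, and the rule "link two movies if at least one user rated both" are assumptions made for illustration; the abstract does not state the authors' exact projection rule.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical two-mode data: each user maps to the set of movies they rated.
user_movies = {
    "u1": {"A", "B", "C"},
    "u2": {"A", "B"},
    "u3": {"B", "C"},
    "u4": {"D"},
}

def project_to_movies(user_movies):
    """Build a one-mode movie network: link two movies if some user rated both."""
    neighbors = defaultdict(set)
    for movies in user_movies.values():
        for movie in movies:
            neighbors[movie]  # touch the key so isolated movies still appear
        for a, b in combinations(sorted(movies), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return dict(neighbors)

def degree_centrality(neighbors):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

movie_net = project_to_movies(user_movies)
centrality = degree_centrality(movie_net)
```

In a full pipeline, values such as these would be appended to the genre indicators to form the feature vector passed to a classifier.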

A study of Bayesian inference on auto insurance credibility application (자동차보험 신뢰도 적용에 대한 베이지안 추론 방식 연구)

  • Kim, Myung Joon;Kim, Yeong-Hwa
    • Journal of the Korean Data and Information Science Society / v.24 no.4 / pp.689-699 / 2013
  • This paper studies how to apply partial credibility under an empirical or noninformative prior in the auto insurance business, where intensive rating segmentation has expanded because of premium competition. Expanding rating factor segmentation increases the number of pricing cells, and the number of cells requiring partial credibility increases correspondingly. This study suggests a more accurate estimation method within the Bayesian framework. Using empirically well-known or noninformative prior information, we derive the proper posterior distribution and apply the Bayes estimate, which minimizes the error loss, to the credibility method, and we show the advantage of Bayesian inference by comparison with current approaches. The comparison is made against the square root rule, a widely accepted method in the insurance business. The level of convergence toward the true risk is compared among the approaches. This study introduces an alternative way of reducing error for the auto insurance business, which needs a variety of methods because of increasing segmentation.
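
To make the comparison in this abstract concrete, the sketch below contrasts the square root rule with a Bayes estimate in a textbook conjugate setting. It is not the authors' model: the Poisson claim counts with a gamma prior, the full-credibility standard of 1082 claims, and all function names are assumptions chosen for illustration. In that conjugate case the posterior mean is itself a credibility-weighted average with weight Z = n / (n + beta).

```python
import math

def square_root_rule(n_claims, n_full=1082):
    """Partial credibility under the square root rule: Z = min(1, sqrt(n / n_full)).

    n_full is the full-credibility standard; 1082 is an assumed conventional value.
    """
    return min(1.0, math.sqrt(n_claims / n_full))

def credibility_estimate(z, observed_mean, prior_mean):
    """Blend the cell's own experience with the prior (collective) mean."""
    return z * observed_mean + (1.0 - z) * prior_mean

def poisson_gamma_posterior_mean(total_claims, exposures, alpha, beta):
    """Bayes estimate of claim frequency under squared error loss.

    Assumes Poisson claim counts and a Gamma(alpha, rate=beta) prior,
    so the posterior is Gamma(alpha + total_claims, rate=beta + exposures).
    """
    return (alpha + total_claims) / (beta + exposures)

# Hypothetical pricing cell: 12 claims observed over 80 exposures.
alpha, beta = 2.0, 20.0            # assumed prior with mean alpha / beta = 0.10
claims, exposures = 12, 80
bayes = poisson_gamma_posterior_mean(claims, exposures, alpha, beta)
z_bayes = exposures / (exposures + beta)
same = credibility_estimate(z_bayes, claims / exposures, alpha / beta)
```

Here `bayes` and `same` agree, which shows that the Bayes estimate assigns a credibility weight driven by the prior's strength rather than by a fixed claim-count standard.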

Modeling Consumers' WOM (Word-Of-Mouth) Behavior with Subjective Evaluation and Objective Information on High-tech Products (하이테크 제품에 대한 소비자의 주관적 평가와 객관적 정보 구전 활동에 대한 연구)

  • Chung, Jaihak
    • Asia Marketing Journal / v.11 no.1 / pp.73-92 / 2009
  • Consumers influence other consumers' brand choice behavior by delivering a variety of objective or subjective information on a particular product, which is called WOM (Word-Of-Mouth) activity. In WOM activities, senders must choose which messages to deliver to other consumers. We classify the contents of the messages a consumer chooses for WOM delivery into two categories: subjective (positive or negative) evaluation and objective information on products. In our study, we regard WOM senders' activities as a choice behavior and introduce a choice model to study the relationship between the choice of different WOM information (WOM with positive or negative subjective evaluation and WOM with objective information) and its influencing factors (information sources and consumer characteristics) by developing two bivariate Probit models. To consider the mediating effects of WOM senders' product involvement, product attitude, and characteristics (gender and age), we develop three second-level models for the propagation of positive evaluations, of negative evaluations, and of objective information on products in a hierarchical Bayesian modeling framework. Our empirical results show that WOM senders' information choice behavior differs according to the type of information source. The effects of information sources on WOM activities also differ according to the type of WOM message (positive or negative subjective evaluation, or objective information). Therefore, our study concludes that WOM activities can be partially managed with effective communication plans that influence consumers' WOM message choice behavior. The empirical results provide some guidelines for encouraging consumers to propagate the product information that companies want spread.


Analysis of Elderly Drivers' Accident Models Considering Operations and Physical Characteristics (고령운전자 운전 및 신체특성을 반영한 교통사고 분석 연구)

  • Lim, Sam Jin;Park, Jun Tae;Kim, Young Il;Kim, Tae Ho
    • Journal of Korean Society of Transportation / v.30 no.6 / pp.37-46 / 2012
  • The number of traffic accidents caused by elderly drivers over the age of 65 has surged over the past ten years from 37,000 to 274,000 cases. The proportion of elderly drivers' accidents has jumped 3.1 times, from 1.2% to 3.7% of all traffic accidents, and traffic safety organizations are pursuing diverse measures to address the situation. Above all, connecting safety measures with in-depth research on the behavioral and physical characteristics of elderly drivers will prove vital. This study conducted empirical research linking driving characteristics and traffic accidents by elderly drivers, based on Driving Aptitude Test items and traffic accident data, which enabled the measurement of behavioral characteristics of elderly drivers. In developing the Influence Model, we applied the zero-inflated Poisson (ZIP) regression model and selected an accident prediction model based on the Bayesian Influence with regard to the ZIP regression model and the zero-inflated negative binomial (ZINB) regression model. According to the results of the AAE analysis, the ZIP regression model was more appropriate, and three variables (prediction of velocity, diversion, and cognitive ability) were found to influence traffic accidents caused by elderly drivers.
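
For readers unfamiliar with the zero-inflated Poisson model named in this abstract, the sketch below gives its standard probability mass function: with probability `pi` an observation is a structural zero, and otherwise the count follows a Poisson distribution with rate `lam`. The function and parameter names are illustrative assumptions; the study's regression links, covariates, and fitted values are not reproduced here.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with rate lam and extra-zero probability pi."""
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # Zeros come from two sources: structural zeros and Poisson zeros.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

def zip_mean(lam, pi):
    """Expected count: only the Poisson component contributes."""
    return (1.0 - pi) * lam

# Hypothetical parameters: 30% structural zeros, Poisson rate of 1 accident.
probs = [zip_pmf(k, lam=1.0, pi=0.3) for k in range(60)]
```

In a ZIP regression, `lam` and `pi` would each be tied to driver covariates through link functions (commonly log and logit), which is what lets the model account for the many drivers with no recorded accidents.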

Customer Behavior Prediction of Binary Classification Model Using Unstructured Information and Convolution Neural Network: The Case of Online Storefront (비정형 정보와 CNN 기법을 활용한 이진 분류 모델의 고객 행태 예측: 전자상거래 사례를 중심으로)

  • Kim, Seungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.221-241 / 2018
  • Deep learning has been attracting attention recently. The deep learning technique applied in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and in AlphaGo is the Convolution Neural Network (CNN). A CNN divides the input image into small sections, recognizes partial features in each, and combines them to recognize the whole. Deep learning technologies are expected to bring many changes to our lives, but until now their applications have been limited mainly to image recognition and natural language processing. The use of deep learning techniques for business problems is still at an early research stage. If their performance is proven, they can be applied to traditional business problems such as marketing response prediction, fraudulent transaction detection, and bankruptcy prediction. It is therefore a meaningful experiment to assess whether deep learning can solve business problems, using the case of online shopping companies, which have big data, can identify customer behavior relatively easily, and can put the results to high-value use. In online shopping in particular, the competitive environment is changing rapidly and becoming more intense, so analyzing customer behavior to maximize profit is becoming more and more important. In this study, we propose a 'CNN model of Heterogeneous Information Integration' as a way to improve the power of predicting customer behavior in online shopping enterprises.
To optimize performance, the proposed model learns from a convolution neural network combined with a multi-layer perceptron structure, using both structured and unstructured information. It involves three architectural components: 'heterogeneous information integration', 'unstructured information vector conversion', and 'multi-layer perceptron design'. We evaluate the performance of each architecture and confirm the proposed model based on the results. The target variables for predicting customer behavior are defined as six binary classification problems: re-purchaser, churn, frequent shopper, frequent refund shopper, high amount shopper, and high discount shopper. To verify the usefulness of the proposed model, we conducted experiments using actual transaction, customer, and voice of the customer (VOC) data from a specific online shopping company in Korea. Data extraction criteria are defined for 47,947 customers who registered at least one VOC in January 2011 (1 month). The profiles of these customers, a total of 19 months of trading data from September 2010 to March 2012, and the VOCs posted during that month are used. The experiment is divided into two stages. In the first stage, we evaluate three architectures that affect the performance of the proposed model and select optimal parameters. In the second stage, we evaluate the performance of the proposed model. Experimental results show that the proposed model, which combines structured and unstructured information, is superior to NBC (Naïve Bayes classification), SVM (Support vector machine), and ANN (Artificial neural network). It is therefore significant that the use of unstructured information contributes to predicting customer behavior, and that CNN can be applied to business problems as well as to image recognition and natural language processing problems.
The experiments confirm that CNN is more effective at understanding and interpreting the meaning of context in text VOC data. It is also significant that this empirical research, based on the actual data of an e-commerce company, can extract very meaningful information for predicting customer behavior from VOC data written directly by customers in text format. Finally, through various experiments, the proposed model provides useful information for future research on parameter selection and its effect on performance.