• Title/Summary/Keyword: Rating Prediction


A Study on Customer Review Rating Recommendation and Prediction through Online Promotional Activity Analysis - Focusing on "S" Company Wearable Products - (온라인 판매촉진활동 분석을 통한 고객 리뷰평점 추천 및 예측에 관한 연구 : S사 Wearable 상품중심으로)

  • Shin, Ho-cheol
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.4
    • /
    • pp.118-129
    • /
    • 2022
  • This study develops a strategic model of promotional activities by selecting wearable products sold by domestic online companies, collecting their sales data, and applying a range of analyses and sales forecasts. Several algorithms were applied to the data, and the best-performing one was selected as the final model. The selected gradient boosting model takes nine independent variables (promotion type, price, amount, gender, model, company, grade, sales date, and region) and predicts the dependent variable through supervised learning. The review rating was set as the dependent variable for each type of sales promotion, and the main purpose of the study is to analyze and predict it in more detail with ensemble techniques. The model achieved an AUC of 95% and an F1 score of about 93%. The analysis confirmed that, among the types of sales promotion activity, value-added benefits affected both the number of reviews and the review ratings, and that the major variables affected the reviews and review ratings.

Clustering Method based on Genre Interest for Cold-Start Problem in Movie Recommendation (영화 추천 시스템의 초기 사용자 문제를 위한 장르 선호 기반의 클러스터링 기법)

  • You, Tithrottanak;Rosli, Ahmad Nurzid;Ha, Inay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.57-77
    • /
    • 2013
  • Social media has become one of the most popular media in web and mobile applications. According to a study by the Nielsen Company, social networks and blogs were still the top destination of online users in 2011: nearly 4 in 5 active users visited social networks and blogs, and these sites dominated Americans' Internet time, accounting for 23 percent of time spent online. Facebook is the social network on which U.S. Internet users spend more time than on other services such as Yahoo, Google, AOL Media Network, Twitter, LinkedIn, and so on. Recently, most companies promote their products on Facebook by creating a "Facebook Page" for a specific product. The "Like" option allows users to subscribe to a page and receive updates on the things that interest them. Filmmakers, who produce a great many films around the world, also market and promote their films by exploiting the "Facebook Page". In addition, a great number of streaming service providers allow users to subscribe to their services and watch movies and TV programs instantly over the Internet on PCs, Macs, and TVs. Netflix alone, the world's leading subscription service, has more than 30 million streaming members in the United States, Latin America, the United Kingdom, and the Nordics. As a matter of fact, subscribers are offered a huge number of movies and TV programs in different genres. As a result, users need to spend a lot of time finding the movies that match the genres they are interested in. In recent years, many researchers have proposed methods to improve the prediction of ratings or preferences so as to recommend the most relevant items, such as books, music, or movies, to a target user or to a group of users who share an interest in particular items. 
One of the most popular methods for building a recommendation system is traditional Collaborative Filtering (CF). The method computes the similarity between the target user and other users, and clusters users with the same interests according to the items they have rated. It then predicts ratings for other items from the same group of users and recommends them to the group. Many kinds of items can be recommended to users, such as books, music, movies, news, and videos; in this paper, however, we focus only on movies. CF faces several challenges. The first is the "sparsity problem", which occurs when there is not enough information about a user's preferences; recommendation accuracy is then lower than for neighbors who have provided a large number of ratings. The second is the "cold-start problem", which occurs whenever new users or items with no ratings or only a few ratings are added to the system. For instance, no personalized predictions can be made for a new user without any ratings on record. In this research, we propose a clustering method based on users' genre interests extracted from a social network service (SNS) and on users' movie rating information to solve the "cold-start problem." Our proposed method clusters the target user together with other users by combining the users' genre interests with the rating information. The Facebook Graph holds a huge amount of interesting and useful user information, and we extract it from the "Facebook Pages" that users have "Liked". Moreover, we use the Internet Movie Database (IMDb) as the main dataset. IMDb is an online database that contains a large amount of information about movies, TV programs, and actors. 
This dataset is used not only to provide movie information in our Movie Rating System, but also as a resource for the movie genre information extracted from the "Facebook Page". First, the user must log in to the Movie Rating System with their Facebook account; at the same time, our system collects their genre interests from the "Facebook Page". We conducted a number of experiments to compare the performance of our method with that of other methods. First, we evaluated our proposed method in the normal recommendation case to see how much it improves the recommendation results. Then we evaluated it in the cold-start case. The experiments show that our method outperforms the other methods in both cases.

Sentiment analysis on movie review through building modified sentiment dictionary by movie genre (영역별 맞춤형 감성사전 구축을 통한 영화리뷰 감성분석)

  • Lee, Sang Hoon;Cui, Jing;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.2
    • /
    • pp.97-113
    • /
    • 2016
  • With the growth of Internet data and the rapid development of Internet technology, "big data" analysis is actively conducted to analyze enormous volumes of data for various purposes. In recent years in particular, a number of studies have applied text mining techniques to overcome the limitations of existing structured data analysis. Sentiment analysis, one branch of text mining, is actively studied as a way to score opinions based on the distribution of word polarity in documents. Sentiment analysis usually relies on a sentiment dictionary that records the positivity and negativity of vocabulary. As part of this line of work, this study constructs a sentiment dictionary customized to a specific data domain. A common sentiment dictionary that ignores domain characteristics cannot reflect contextual expressions used only in that domain, so a dictionary customized to the domain can be expected to improve the performance of sentiment analysis. This study therefore proposes a way to construct customized dictionaries that reflect the characteristics of a data domain. Specifically, movie review data are divided by genre and a genre-customized dictionary is constructed for each. The performance of the customized dictionaries in sentiment analysis is compared with that of a common sentiment dictionary. IMDb data are chosen as the subject of analysis, and movie reviews are categorized by genre. Six IMDb genres are selected: 'action', 'animation', 'comedy', 'drama', 'horror', and 'sci-fi'. The five highest-ranking and five lowest-ranking movies in each genre are selected as the training data set, and two years of movie data, from September 2012 to June 2014, are collected as the test data set. 
Using the SO-PMI (Semantic Orientation from Point-wise Mutual Information) technique, we build a customized sentiment dictionary for each genre and compare prediction accuracy on review ratings. The analysis shows that prediction with the customized dictionaries improves accuracy. The overall improvement is 2.82% and is statistically significant. In particular, the customized dictionary for 'sci-fi' yields the largest accuracy improvement among the six genres. Although this study shows the usefulness of customized dictionaries in sentiment analysis, further studies are required to generalize the results. In this study, we consider only adjectives as additional terms in the customized sentiment dictionary. Other parts of speech, such as verbs and adverbs, could be considered to improve sentiment analysis performance. The customized sentiment dictionary also needs to be applied to other domains, such as product reviews.
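
The SO-PMI score mentioned in this abstract can be illustrated with a minimal sketch. It assumes document-level co-occurrence counts, the commonly used form SO-PMI(w) = PMI(w, positive seeds) - PMI(w, negative seeds), and a small smoothing constant; the seed words, the toy reviews, and the function name `so_pmi` are hypothetical and are not taken from the paper.

```python
import math

def so_pmi(word, docs, pos_seeds, neg_seeds, eps=0.01):
    """Semantic orientation of `word` from document-level co-occurrence.

    docs is a list of token sets. A positive score means the word tends to
    appear with positive seed words, a negative score with negative ones.
    eps is a smoothing constant (an assumption) that avoids division by zero.
    """
    def count(pred):
        return sum(1 for d in docs if pred(d))

    pos_hits = count(lambda d: bool(d & pos_seeds))
    neg_hits = count(lambda d: bool(d & neg_seeds))
    with_pos = count(lambda d: word in d and bool(d & pos_seeds))
    with_neg = count(lambda d: word in d and bool(d & neg_seeds))
    return math.log2(((with_pos + eps) * (neg_hits + eps)) /
                     ((with_neg + eps) * (pos_hits + eps)))

# Toy reviews for one genre (made up for illustration).
docs = [
    {"gripping", "excellent", "plot"},
    {"gripping", "good", "cast"},
    {"dull", "poor", "plot"},
    {"dull", "bad", "cast"},
]
pos_seeds = {"excellent", "good"}
neg_seeds = {"poor", "bad"}

score_gripping = so_pmi("gripping", docs, pos_seeds, neg_seeds)
score_dull = so_pmi("dull", docs, pos_seeds, neg_seeds)
```

In a genre-specific dictionary, adjectives with a clearly positive or negative score on that genre's reviews would be added as extra terms; the cut-off for "clearly" is a design choice the abstract does not specify.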

Development of New Variables Affecting Movie Success and Prediction of Weekly Box Office Using Them Based on Machine Learning (영화 흥행에 영향을 미치는 새로운 변수 개발과 이를 이용한 머신러닝 기반의 주간 박스오피스 예측)

  • Song, Junga;Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.67-83
    • /
    • 2018
  • The Korean film industry grew significantly every year and finally exceeded 200 million cumulative admissions in 2013. From 2015, however, it entered a period of low growth, and it recorded negative growth in 2016. To overcome this difficulty, stakeholders such as production companies, distribution companies, and multiplex chains have tried to maximize market returns by predicting market changes and responding to them immediately. Because a film is an experiential product, it is not easy to predict its box office record or its initial audience before release, and the audience also fluctuates with a variety of factors after release. Production and distribution companies therefore try to secure a guaranteed number of screens from multiplex chains at the opening of a newly released film. The multiplex chains, however, tend to fix the screening schedule for only one week and then determine the number of screenings for the following week based on the box office record and audience evaluations. Many previous studies have dealt with the prediction of box office records. Early research attempted to identify the factors affecting the box office record. More recent studies, instead of identifying new factors, have applied various analytic techniques to the previously identified factors in order to improve prediction accuracy and explain the effect of each factor. However, most previous studies are limited in that they used the total audience from opening to the end of the run as the target variable, which makes it difficult to predict and respond to dynamically changing market demand. 
Therefore, the purpose of this study is to predict the weekly audience of a newly released film so that stakeholders can respond flexibly to changes in the film's audience. To that end, we considered the factors affecting the box office that were used in previous studies and developed new factors not used before, such as the order of opening of movies and the dynamics of sales. With this comprehensive set of factors, we used machine learning methods, namely Random Forest, Multi-Layer Perceptron, Support Vector Machine, and Naive Bayes, to predict the cumulative number of visitors from the first week after release to the third week. At the first and second weeks, we predicted the cumulative number of visitors in the following week. At the third week, we predicted the total number of visitors of the film. In addition, we predicted the total number of visitors at the first and second weeks using the same factors. We found that the accuracy of predicting the number of visitors in the following week was higher than that of predicting the total number in all three weeks, and that Random Forest was the most accurate of the machine learning methods we used. This study has two implications: 1) it comprehensively considered various factors that affect the box office record but were rarely addressed by previous research, such as the weekly audience rating, the weekly rank of the film, and the weekly sales share after release; and 2) it attempted to predict and respond to dynamically changing market demand by suggesting models that predict the weekly audience of newly released films, so that stakeholders can respond flexibly to changes in the audience.

Social Network-based Hybrid Collaborative Filtering using Genetic Algorithms (유전자 알고리즘을 활용한 소셜네트워크 기반 하이브리드 협업필터링)

  • Noh, Heeryong;Choi, Seulbi;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.2
    • /
    • pp.19-38
    • /
    • 2017
  • Collaborative filtering (CF) algorithms have been widely used to implement recommender systems, and many prior studies have sought to improve the accuracy of CF. Among them, some recent studies adopt a 'hybrid recommendation approach', which enhances the performance of conventional CF by using additional information. In this research, we propose a new hybrid recommender system that fuses CF with the results of social network analysis on the trust and distrust relationship networks among users to enhance prediction accuracy. The proposed algorithm is based on memory-based CF. When calculating the similarity between users, however, it considers not only the correlation of the users' numeric rating patterns but also the users' in-degree centrality values derived from the trust and distrust relationship networks. Specifically, it amplifies the similarity between a target user and a neighbor when the neighbor has higher in-degree centrality in the trust relationship network, and attenuates that similarity when the neighbor has higher in-degree centrality in the distrust relationship network. The algorithm considers four types of user relationships in total: direct trust, indirect trust, direct distrust, and indirect distrust. It uses four adjusting coefficients, which set the level of amplification or attenuation for the in-degree centrality values derived from the direct and indirect trust and distrust relationship networks. A genetic algorithm (GA) was adopted to determine the optimal adjusting coefficients. Accordingly, we named the proposed algorithm SNACF-GA (Social Network Analysis-based CF using GA). To validate the performance of SNACF-GA, we used a real-world data set, the 'Extended Epinions dataset' provided by 'trustlet.org'. 
The data set contains user responses (rating scores and reviews) after purchasing specific items (e.g. car, movie, music, book) as well as trust / distrust relationship information indicating whom each user trusts or distrusts. The experimental system was developed mainly in Microsoft Visual Basic for Applications (VBA), but we also used UCINET 6 to calculate the in-degree centrality of the trust / distrust relationship networks. In addition, we used Palisade Software's Evolver, a commercial package that implements genetic algorithms. To examine the effectiveness of the proposed system more precisely, we adopted two comparison models. The first is conventional CF, which uses only users' explicit numeric ratings when calculating the similarities between users; that is, it does not consider trust / distrust relationships at all. The second is SNACF (Social Network Analysis-based CF). SNACF differs from the proposed SNACF-GA in that it considers only direct trust / distrust relationships and does not use GA optimization. The performance of the proposed algorithm and the comparison models was evaluated using the average MAE (mean absolute error). The experimental results showed that the optimal adjusting coefficients for direct trust, indirect trust, direct distrust, and indirect distrust were 0, 1.4287, 1.5, and 0.4615, respectively. This implies that distrust relationships between users are more important than trust relationships in recommender systems. In terms of recommendation accuracy, SNACF-GA (Avg. MAE = 0.111943), which reflects both direct and indirect trust / distrust relationship information, outperformed conventional CF (Avg. MAE = 0.112638). It also showed better recommendation accuracy than SNACF (Avg. MAE = 0.112209). To confirm whether these differences are statistically significant, we applied a paired-samples t-test. 
The paired-samples t-test showed that the difference between SNACF-GA and conventional CF was statistically significant at the 1% significance level, and the difference between SNACF-GA and SNACF was statistically significant at the 5% level. Our study found that trust / distrust relationships can be important information for improving the performance of recommendation algorithms. In particular, distrust relationship information was found to have a greater impact on the performance improvement of CF. This implies that we need to pay more attention to distrust (negative) relationships than to trust (positive) ones when tracking and managing social relationships between users.
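
The idea of amplifying or attenuating user similarity with centrality values can be sketched as follows. The abstract does not give the exact adjustment formula, so the multiplicative form in `adjusted_similarity` is purely an assumption, as are the function names, the toy ratings, and the centrality values (taken here to be normalized to [0, 1]). Only the four coefficient values come from the abstract.

```python
import math

def pearson(a, b):
    """Pearson correlation over the items both users rated (dict: item -> rating)."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    return cov / math.sqrt(var_a * var_b) if var_a and var_b else 0.0

# Optimal adjusting coefficients as reported in the abstract.
COEFFS = {"direct_trust": 0.0, "indirect_trust": 1.4287,
          "direct_distrust": 1.5, "indirect_distrust": 0.4615}

def adjusted_similarity(sim, centrality, coeffs=COEFFS):
    """Hypothetical adjustment: trust centrality amplifies, distrust attenuates.

    centrality maps each relationship type to the neighbor's in-degree
    centrality; missing types count as 0.
    """
    boost = 1.0 + sum(coeffs[k] * centrality.get(k, 0.0)
                      for k in ("direct_trust", "indirect_trust"))
    damp = 1.0 + sum(coeffs[k] * centrality.get(k, 0.0)
                     for k in ("direct_distrust", "indirect_distrust"))
    return sim * boost / damp

target = {"a": 5, "b": 3, "c": 1}
neighbor = {"a": 4, "b": 3, "c": 2}
sim = pearson(target, neighbor)
sim_trusted = adjusted_similarity(sim, {"indirect_trust": 0.5})
sim_distrusted = adjusted_similarity(sim, {"direct_distrust": 0.5})
```

With these toy inputs the widely trusted neighbor ends up with a larger weight than the raw correlation, and the widely distrusted neighbor with a smaller one, which is the qualitative behavior the abstract describes.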

A Study on the Differences of Cognitive Functions, Neurobehavioral Symptoms and Daily Living Functions According to the Lateralization of Lesion in Patients with Non-Traumatic Subcortical Cerebrovascular Disease (비외상성 피질하 뇌혈관질환 환자에서 병소의 편측성에 따른 인지기능, 정신행동증상 및 일상생활기능의 차이에 대한 연구)

  • Park, Young-Soo;Lee, Young-Ho;Choi, Young-Hee;Ko, Dae-Kwan;Chung, Young-Cho;Park, Byoung-Kwan;Kim, Soo-Ji;Chung, Suk-Haui;Ko, Byoung-Hee;Song, Il-Byoung;Park, Kun-Woo;Lee, Dae-Hie
    • Sleep Medicine and Psychophysiology
    • /
    • v.3 no.1
    • /
    • pp.56-67
    • /
    • 1996
  • Objectives: This study was designed to identify clinical factors that differ according to the lateralization of the lesion and clinical factors that predict the lateralization of the lesion. Methods: The subjects were 65 cooperative inpatients and outpatients with non-traumatic subcortical cerebrovascular disease and no neurologic or psychiatric history, seen from January 1995 to September 1995. Forty-eight patients at Kyung Hee University Oriental Medicine Hospital and 35 patients at Anam Hospital, Korea University, were examined, but the authors excluded 20 patients whose data were incomplete or who had uncertain lesions on brain CT or MRI. The 65 patients were divided into three groups according to the brain imaging findings: left hemispheric lesion, right hemispheric lesion, and lesions in both hemispheres. Cognitive functions were evaluated with the Benton Neuropsychological Assessment (BNA), subjective neurobehavioral symptoms with the Symptom Check List-90-R (SCL-90-R), objective neurobehavioral symptoms with the Neurobehavioral Rating Scale, and daily living functions with the Geriatric Evaluation by Relative's Rating Instrument (GERRI) and the Instrumental Activities of Daily Living Scale (IADLs). Results: 1) On the cognitive function tests, the group with right hemispheric lesion showed low function in Tactile Form Perception (left), the group with left hemispheric lesion showed low function in Finger Localization (right), and the group with right hemispheric lesion showed low function in Finger Localization (left). 2) Although there were few significant differences in subjective neurobehavioral symptoms, the group with right hemispheric lesion showed higher scores in all symptoms except hostility. 
3) Although there were few significant differences in objective neurobehavioral symptoms, the group with lesions in both hemispheres showed higher scores in cognition and guilt/disinhibition, the group with left hemispheric lesion showed higher scores in lability of mood, and the group with right hemispheric lesion showed the highest scores in psychoticism, neuroticism, agitation-hostility, and decreased motivation/emotional withdrawal. 4) There were few significant differences among the three groups in daily living functions, but the group with right hemispheric lesion showed the lowest function in Instrumental Activities of Daily Living. 5) In the discriminant analysis of each factor's contribution to the prediction of the lesion, Finger Localization (left), Phoneme Discrimination, and Tactile Form Perception (right) showed potential to predict the lesion. Conclusion: The results suggest that there are few significant differences in cognitive functions among the three groups with non-traumatic subcortical cerebrovascular disease, but that the group with right hemispheric lesion shows more serious and more varied changes in subjective and objective neurobehavioral symptoms and low function in Instrumental Activities of Daily Living. These results suggest the possibility that the decline in daily living function in the group with right hemispheric lesion is due to these various symptoms rather than to cognitive dysfunction. This possibility should be confirmed through follow-up studies that include groups with cortical lesions. Apart from these findings, Finger Localization, Tactile Form Perception (right), and Phoneme Discrimination may serve as clinically valuable cognitive parameters for predicting the lateralization of the lesion in non-traumatic cerebrovascular disease.

Assessment of System Reliability and Capacity-Rating of Concrete Box-Girder Highway Bridges (R.C 박스거교의 체계신뢰성 해석 및 안전도 평가)

  • 조효남;신재철
    • Magazine of the Korea Concrete Institute
    • /
    • v.7 no.3
    • /
    • pp.187-198
    • /
    • 1995
  • This paper develops practical and realistic reliability models and methods for the evaluation of system reliability and system reliability-based rating of R.C box girder bridge superstructures. The precise prediction of the reserved carrying capacity of a bridge as a system is extremely difficult, especially when the bridges are highly redundant and significantly deteriorated or damaged. This paper proposes a new approach for the evaluation of the reserved system carrying capacity of bridges in terms of equivalent system strength, which may be defined as a bridge system strength corresponding to the system reliability of the bridge. This can be derived from an inverse process based on the concept of the FOSM (First Order Second Moment) form of the system reliability index. The strength limit state models for R.C box girder bridges suggested in the paper are based on the basic bending and shear strength. The system reliability problem of the box girder superstructure is formulated as parallel-series models obtained from the FMA (Failure Mode Approach) based on major failure mechanisms or critical failure states of each girder. AFOSM (Advanced First Order Second Moment) and IST (Importance Sampling Technique) simulation algorithms are used for the reliability analysis of the proposed models.
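
The "inverse process" from a reliability index back to a strength can be illustrated for the simplest textbook case. The sketch below assumes a single linear limit state g = R - S with independent resistance R and load effect S, for which the FOSM index is (mean of R - mean of S) divided by the square root of the summed variances. It is not the parallel-series system model of the paper, and all numbers and function names are hypothetical.

```python
import math

def fosm_beta(mu_r, sigma_r, mu_s, sigma_s):
    """FOSM reliability index for g = R - S with independent R and S."""
    return (mu_r - mu_s) / math.sqrt(sigma_r ** 2 + sigma_s ** 2)

def equivalent_strength(beta, sigma_r, mu_s, sigma_s):
    """Inverse step: the mean strength that corresponds to a given index beta."""
    return mu_s + beta * math.sqrt(sigma_r ** 2 + sigma_s ** 2)

# Hypothetical resistance and load-effect statistics in consistent units.
beta = fosm_beta(mu_r=500.0, sigma_r=50.0, mu_s=300.0, sigma_s=40.0)
mu_eq = equivalent_strength(beta, sigma_r=50.0, mu_s=300.0, sigma_s=40.0)
```

Feeding the computed index back through the inverse recovers the original mean strength, which is the round trip that an equivalent-strength definition relies on in this simplified setting.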

Prediction of Ground Subsidence Hazard Area Using GIS and Probability Model near Abandoned Underground Coal Mine (GIS 및 확률모델을 이용한 폐탄광 지역의 지반침하 위험 예측)

  • Choi, Jong-Kuk;Kim, Ki-Dong;Lee, Sa-Ro;Kim, Il-Soo;Won, Joong-Sun
    • Economic and Environmental Geology
    • /
    • v.40 no.3 s.184
    • /
    • pp.295-306
    • /
    • 2007
  • In this study, we predicted areas vulnerable to ground subsidence near an abandoned underground coal mine at Sam-cheok City in Korea using a probability (frequency ratio) model with a Geographic Information System (GIS). To extract the factors related to ground subsidence, a spatial database was constructed from a topographical map, geological map, mining tunnel map, land characteristic map, and borehole data for the study area, including subsidence sites surveyed in 2000. Eight major factors were extracted from the spatial analysis and the probability analysis of the surveyed ground subsidence sites. We calculated the coefficient of determination ($R^2$) to examine the relationship between the eight factors and the occurrence of ground subsidence. The frequency ratio model was applied to determine each factor's relative rating, and the ratings were then overlaid for ground subsidence hazard mapping. The ground subsidence hazard map was verified by comparison with the surveyed ground subsidence sites. The verification showed a high accuracy of 96.05% between the predicted hazard map and the actual ground subsidence sites. Therefore, quantitative analysis of ground subsidence near abandoned underground coal mines is possible with a frequency ratio model and a GIS.
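
A frequency ratio rating and the overlay step can be sketched in a few lines. The sketch assumes the usual definition, in which a class's frequency ratio is its share of subsidence cells divided by its share of all cells, and assumes the overlay is a simple sum of ratings. The factor names, class boundaries, and cell counts are made up for illustration and are not the paper's eight factors.

```python
def frequency_ratio(cells, events):
    """Frequency ratio per class: (share of event cells) / (share of all cells)."""
    total_cells = sum(cells.values())
    total_events = sum(events.values())
    return {
        cls: (events.get(cls, 0) / total_events) / (cells[cls] / total_cells)
        for cls in cells
    }

def hazard_index(cell_classes, ratios_by_factor):
    """Overlay step: sum the ratings of the classes a grid cell falls into."""
    return sum(ratios_by_factor[factor][cls] for factor, cls in cell_classes.items())

# Hypothetical cell counts per class for two illustrative factors.
tunnel_cells = {"0-50 m": 1000, "50-100 m": 3000, ">100 m": 6000}
tunnel_events = {"0-50 m": 30, "50-100 m": 15, ">100 m": 5}
slope_cells = {"gentle": 5000, "steep": 5000}
slope_events = {"gentle": 20, "steep": 30}

ratios = {
    "tunnel_distance": frequency_ratio(tunnel_cells, tunnel_events),
    "slope": frequency_ratio(slope_cells, slope_events),
}
index = hazard_index({"tunnel_distance": "0-50 m", "slope": "steep"}, ratios)
```

A ratio above 1 marks a class where subsidence is over-represented relative to its area, so cells with a high summed index are the ones a hazard map would flag.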

Mathematical Performance Predictions of Mathematically Gifted Students with Gifted Behavior Ratings by Teachers and Parents (수학영재의 수행능력에 대한 교사 및 부모 평정의 예측력)

  • Lee, Mi-Soon
    • Journal of Gifted/Talented Education
    • /
    • v.21 no.4
    • /
    • pp.829-845
    • /
    • 2011
  • The purpose of this study was to examine how well gifted behavior ratings by teachers and parents predict mathematical performance. The participants were 787 gifted students in the 5th and 6th grades of elementary school who took a mathematical performance test. Teachers of the gifted and parents were asked to rate the gifted behaviors of these students using the SRBCSS-R (Renzulli et al., 2002, 2009). The results indicated that teachers rated the gifted behaviors of the 5th grade students higher than those of the 6th grade students, except in 'mathematical characteristics.' Teachers rated the 'learning' behaviors of male students higher than those of female students. Meanwhile, parents of the 5th grade students gave higher ratings than parents of the 6th grade students in 'learning' and 'motivation.' Comparing the ratings by teachers and parents, there were significant differences in 'learning' and 'motivation': teachers rated the students significantly higher than parents did on both. When the study examined how well the ratings by teachers and parents predicted the students' mathematical performance, the 'learning' and 'mathematical characteristics' ratings by teachers predicted mathematical performance.

A Study on the Development of Forest Fire Occurrence Probability Model using Canadian Forest Fire Weather Index -Occurrence of Forest Fire in Kangwon Province- (캐나다 산불 기상지수를 이용한 산불발생확률모형 개발 -강원도 지역 산불발생을 중심으로-)

  • Park, Houng-Sek;Lee, Si-Young;Chae, Hee-Mun;Lee, Woo-Kyun
    • Journal of the Korean Society of Hazard Mitigation
    • /
    • v.9 no.3
    • /
    • pp.95-100
    • /
    • 2009
  • The fine fuel moisture code (FFMC), a main component of the forest fire weather index (FWI) in the Canadian Forest Fire Danger Rating System (CFFDRS), indicates the probability of ignition by estimating the dryness of fine fuels. According to this code, rising temperature and wind velocity, decreasing precipitation, and declining humidity raise the danger rating for forest fire. In this study, we analyzed five years of weather conditions in Kangwon Province, calculated the FFMC, and examined its applicability. Very low humidity and little precipitation characterized the spring and fall fire seasons in Kangwon Province. Of the forest fires during the five years, 75% occurred in these seasons, and 90% of the fires during the fire seasons occurred in spring. To develop a prediction model for the probability of forest fire occurrence, we used a logistic regression function with forest fire occurrence data and the classified 10-day mean FFMC. The accuracy of the developed model was 63.6%. To improve the model, we need to use more meteorological data covering all seasons and to relate meteorological conditions to forest fire occurrence using more research results.
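
A logistic occurrence model of the kind described here can be sketched as follows. The abstract does not report the fitted coefficients, so `INTERCEPT` and `SLOPE` are placeholders chosen only so that the probability crosses 0.5 at a mean FFMC of 85, and the sample periods are made up. The sketch shows how a probability and a classification accuracy would be computed, not the paper's model.

```python
import math

# Placeholder coefficients for illustration only (not the fitted values).
INTERCEPT = -17.0
SLOPE = 0.2

def fire_probability(mean_ffmc):
    """Logistic function of the 10-day mean FFMC."""
    z = INTERCEPT + SLOPE * mean_ffmc
    return 1.0 / (1.0 + math.exp(-z))

def accuracy(samples, threshold=0.5):
    """Share of (mean_ffmc, fire_occurred) pairs classified correctly."""
    hits = sum((fire_probability(x) >= threshold) == bool(y) for x, y in samples)
    return hits / len(samples)

# Made-up 10-day periods: (mean FFMC, 1 if a fire occurred else 0).
samples = [(70.0, 0), (80.0, 0), (88.0, 1), (92.0, 1), (90.0, 0)]
model_accuracy = accuracy(samples)
```

The 0.5 threshold is itself a design choice; lowering it trades more false alarms for fewer missed fires, which may matter more than raw accuracy for a warning system.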