• Title/Summary/Keyword: movie rating prediction (영화 평점 예측)

A Rating System on Movie Reviews using the Emotion Feature and Kernel Model (감정자질과 커널모델을 이용한 영화평 평점 예측 시스템)

  • Xu, Xiang-Lan;Jeong, Hyoung-Il;Seo, Jung-Yun
    • Annual Conference on Human and Language Technology / 2011.10a / pp.37-41 / 2011
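  • This paper proposes a system, framed as opinion mining, a topic of much recent interest, that analyzes users' natural-language movie review sentences and automatically predicts their ratings. The proposed system extracts lexical features, emotion features, value features, and other features suited to movie review analysis, treats the ratings of reviews on a 10-point scale as 10 categories, and uses a kernel model, a multi-class Support Vector Machine (SVM), to classify reviews into rating categories with high performance.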

CNN Architecture Predicting Movie Rating from Audience's Reviews Written in Korean (한국어 관객 평가기반 영화 평점 예측 CNN 구조)

  • Kim, Hyungchan;Oh, Heung-Seon;Kim, Duksu
    • KIPS Transactions on Computer and Communication Systems / v.9 no.1 / pp.17-24 / 2020
  • In this paper, we present a movie rating prediction architecture based on a convolutional neural network (CNN). Our prediction architecture extends TextCNN, a popular CNN-based architecture for sentence classification, in three aspects. First, character embeddings are utilized to cover many variants of words since reviews are short and not well-written linguistically. Second, the attention mechanism (i.e., squeeze-and-excitation) is adopted to focus on important features. Third, a scoring function is proposed to convert the output of an activation function to a review score in a certain range (1-10). We evaluated our prediction architecture on a movie review dataset and achieved a low MSE (e.g., 3.3841) compared with an existing method. It showed the superiority of our movie rating prediction architecture.
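
The third point of the abstract above, mapping an activation output into the 1-10 rating range, can be illustrated with a minimal sketch. The abstract does not give the paper's actual scoring function, so the scaled sigmoid below is an assumption, and the names `to_rating` and `mse` are hypothetical helpers rather than anything from the paper.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def to_rating(z: float, low: float = 1.0, high: float = 10.0) -> float:
    """Map an unbounded network output z to a score between low and high.

    Assumption: a linearly rescaled sigmoid; the paper's own function may differ.
    """
    return low + (high - low) * sigmoid(z)


def mse(predicted, actual):
    """Mean squared error, the metric reported in the abstract."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical raw outputs for three reviews.
logits = [-2.0, 0.0, 3.0]
ratings = [to_rating(z) for z in logits]
```

Because the sigmoid is bounded, every predicted score stays inside the rating range regardless of how large the raw output is, which is one plausible reason to use a scoring function instead of an unconstrained regression output.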

Sentiment Analysis of movie review for predicting movie rating (영화리뷰 감성 분석을 통한 평점 예측 연구)

  • Jo, Jung-Tae;Choi, Sang-Hyun
    • Management & Information Systems Review / v.34 no.3 / pp.161-177 / 2015
  • The influence of Internet portal sites, which give quick and easy access to vast amounts of information, is increasing. Users connect to the Internet through a portal to obtain information and to communicate with other users for a variety of purposes. When searching for a movie, people are exposed to a wide range of information from other users. Reviews, which are limited in length, and ratings shape how users relate to a movie and help them decide whether to see it or look for another one. However, a user cannot read every review of a movie; when users see an overall evaluation, they can obtain accurate information. This study predicts ratings from review data. The information in reviews falls into two main areas: "fact" and "opinion". "Fact" conveys dispassionate information, and "opinion" expresses the user's feelings. We built a sentiment dictionary based on the evaluations in online reviews and applied it to evaluate other movies. In a comparison with a simple emotion evaluation technique, the proposed algorithm produced more accurate results.

Movie Recommendation System based on Latent Factor Model (잠재요인 모델 기반 영화 추천 시스템)

  • Ma, Chen;Kim, Kang-Chul
    • The Journal of the Korea institute of electronic communication sciences / v.16 no.1 / pp.125-134 / 2021
  • With the rapid development of the film industry, the number of films is increasing significantly, and a movie recommendation system can help predict users' preferences based on their past behavior or feedback. This paper proposes a movie recommendation system based on the latent factor model with adjustment of the mean and bias in ratings. Singular value decomposition is used to decompose the rating matrix, and stochastic gradient descent is used to optimize the parameters of a least-squares loss function. Root mean square error is used to evaluate the performance of the proposed system. We implement the proposed system with the Surprise package. The simulation results show that the root mean square error is 0.671 and that the proposed system performs well compared with other papers.
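
The approach described in the abstract above (a latent factor model with mean and bias terms, fitted by stochastic gradient descent on a squared-error loss and scored by RMSE) can be sketched in plain Python. This is not the paper's implementation, which uses the Surprise package. The prediction form mu + b_u + b_i + p_u·q_i, the hyperparameters, and the toy data are all assumptions chosen for illustration.

```python
import math
import random


def train_biased_mf(ratings, n_users, n_items, n_factors=2,
                    lr=0.01, reg=0.02, n_epochs=300, seed=0):
    """Fit mu + b_u + b_i + p_u . q_i to (user, item, rating) triples with SGD."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    bu = [0.0] * n_users                               # user biases
    bi = [0.0] * n_items                               # item biases
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(n_epochs):
        for u, i, r in ratings:
            pred = mu + bu[u] + bi[i] + sum(a * b for a, b in zip(P[u], Q[i]))
            err = r - pred
            # Gradient steps on the L2-regularized squared error.
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                pf, qf = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qf - reg * pf)
                Q[i][f] += lr * (err * pf - reg * qf)
    return mu, bu, bi, P, Q


def predict(model, u, i):
    mu, bu, bi, P, Q = model
    return mu + bu[u] + bi[i] + sum(a * b for a, b in zip(P[u], Q[i]))


def rmse(model, ratings):
    se = sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))


# Toy (user, item, rating) data, invented for this sketch.
toy = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
       (1, 2, 1.0), (2, 1, 2.0), (2, 2, 1.0)]
model = train_biased_mf(toy, n_users=3, n_items=3)
train_rmse = rmse(model, toy)
```

The RMSE computed here is a training error on six invented ratings, so it says nothing about the 0.671 figure reported in the abstract. A real evaluation would hold out test ratings.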

Semantic analysis via application of deep learning using Naver movie review data (네이버 영화 리뷰 데이터를 이용한 의미 분석(semantic analysis))

  • Kim, Sojin;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.19-33 / 2022
  • With the explosive growth of social media, the abundant text-based data generated by web users has become an important source for data analysis. For example, online movie reviews on 'Naver Movie' often influence the general public's decision on whether to watch a movie. This study analyzes Naver Movie's text-based review data to predict the actual ratings. After examining the distribution of movie ratings, we performed semantic analysis using Korean natural language processing. We sought the best review rating prediction model by comparing machine learning and deep learning models. We also compared various regression and classification models in 2-class and multi-class cases. Lastly, we explained the causes of review misclassification in relation to the characteristics of movie review data.

A Model of Predictive Movie 10 Million Spectators through Big Data Analysis (빅데이터 분석을 통한 천만 관객 영화 예측 모델)

  • Yu, Jong-Pil;Lee, Eung-hwan
    • The Journal of Bigdata / v.3 no.1 / pp.63-71 / 2018
  • We analyzed which factors influenced Korean films that surpassed 10 million viewers over the last five years (2013~2017) in the Korean movie industry, where the total number of moviegoers exceeds 200 million. In general, many people consider the number of screens and ratings to be important factors in a film's success with audiences. In this study, hypotheses were established for four additional factors, including the number of screens and ratings, and their correlation with reaching 10 million spectators was examined through big data analysis. The results were significant, with 91 percent accuracy in predicting 10 million viewers and 99.4 percent accuracy in estimating cumulative attendance.

Recommendation Reflecting User Preferences on Genres (유저의 장르 선호도를 반영한 추천)

  • Lee, Ho-Jong;Hwang, Won-Seok;Kim, Sang-Wook
    • Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.1285-1286 / 2011
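  • Among studies of recommendation systems for MovieLens, the k-NN recommendation method is relatively accurate but can encounter situations in which it cannot predict a rating. This paper proposes a genre-based recommendation method that solves this problem of the existing method, and verifies through experiments that the proposed method can predict ratings for all movies.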
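
The abstract above does not describe how the genre-based method computes a prediction, so the following is only a hypothetical sketch of one way genre information can guarantee a prediction for every movie. It averages a user's past ratings per genre and falls back to the user's overall mean when a movie's genres are all unseen. The function names and toy data are invented for illustration.

```python
from collections import defaultdict


def genre_profile(user_ratings, movie_genres):
    """Average rating the user has given to each genre."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for movie, rating in user_ratings.items():
        for genre in movie_genres[movie]:
            totals[genre] += rating
            counts[genre] += 1
    return {genre: totals[genre] / counts[genre] for genre in totals}


def predict_rating(user_ratings, movie_genres, movie):
    """Predict from genre averages; fall back to the user's mean rating."""
    profile = genre_profile(user_ratings, movie_genres)
    known = [profile[g] for g in movie_genres[movie] if g in profile]
    if known:
        return sum(known) / len(known)
    # No genre overlap with the user's history: use the overall mean instead.
    return sum(user_ratings.values()) / len(user_ratings)


# Toy data, invented for this sketch.
movie_genres = {
    "A": ["Action"],
    "B": ["Action", "Comedy"],
    "C": ["Drama"],
    "D": ["Comedy"],
    "E": ["Horror"],
}
user_ratings = {"A": 4.0, "B": 2.0, "C": 5.0}

pred_d = predict_rating(user_ratings, movie_genres, "D")  # uses the Comedy average
pred_e = predict_rating(user_ratings, movie_genres, "E")  # falls back to the user mean
```

Unlike a neighbor-based predictor, this sketch never needs another user to have rated the target movie, so it returns a value for any movie with genre labels as long as the user has rated at least one movie.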

Box Office Hit Prediction Using Data mining and Text mining (데이터마이닝과 텍스트마이닝을 활용한 영화 흥행 예측)

  • Jo, Hyo-jung
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.316-318 / 2021
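  • Whether a movie is a box-office hit has an important effect on its revenue. As the movie industry has grown, box-office success factors have become something many production companies and investors must consider. Accordingly, many models for predicting box-office success have been studied. The purpose of this study is to build a box-office prediction model using online word-of-mouth variables together with intrinsic attributes such as the number of screens, the director's name, and the production company's name, which prior studies found to have a significant effect on box-office success. Instead of using variables that represent the volume of online word of mouth, such as the number of articles or blog posts, we apply text mining to audience reviews from the first week after release, assign a score according to the proportion of positive reviews among all reviews, and use that score as an independent variable. The independent variables mentioned above are then used as inputs to a model built with data mining techniques to predict box-office success. Finally, we ran a decision tree and logistic regression to identify the independent variables that affect box-office success and evaluated model performance. In the logistic regression, audience size and rating were selected as variables with an especially significant effect on box-office success, and reviews were also selected as a significant variable. The resulting model showed a high accuracy of about 90%. In the decision tree, audience size was selected as the most important variable.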

A Movie Recommendation System based on Fuzzy-AHP and Word2vec (Fuzzy-AHP와 Word2Vec 학습 기법을 이용한 영화 추천 시스템)

  • Oh, Jae-Taek;Lee, Sang-Yong
    • Journal of Digital Convergence / v.18 no.1 / pp.301-307 / 2020
  • In recent years, with the beginning of the 5G era, recommendation systems have been introduced in many different fields and have become especially prominent in books, movies, and music. In such systems, however, users' degrees of preference are subjective and uncertain, which makes it difficult to provide an accurate recommendation service. Improving the performance of a recommendation system requires large amounts of training data and more accurate estimation techniques. To address this problem, this study proposes a movie recommendation system based on Fuzzy-AHP and Word2vec. The proposed system uses Fuzzy-AHP to make objective predictions about user preference and Word2vec to classify scraped data. The performance of the system was assessed by measuring the accuracy of Word2vec outcomes based on grid search and by comparing the movie ratings predicted by the system with those given by the audience. The results show that the optimal cross-validation accuracy was 91.4%, which indicates excellent performance. The differences in movie ratings between the system and the audience were compared with those of the Fuzzy-AHP system, and the proposed system was superior by approximately 10%.

Pairwise fusion approach to cluster analysis with applications to movie data (영화 데이터를 위한 쌍별 규합 접근방식의 군집화 기법)

  • Kim, Hui Jin;Park, Seyoung
    • The Korean Journal of Applied Statistics / v.35 no.2 / pp.265-283 / 2022
  • MovieLens data consists of recorded movie evaluations and has often been used to measure evaluation scores in recommendation system research. In this paper, we provide additional information obtained by clustering user-specific genre preference information through movie evaluation data and movie genre data. Because the number of movie ratings per user is very low compared to the total number of movies, the missing rate in this data is very high. For this reason, existing clustering methods are of limited applicability. In this paper, we propose a convex clustering-based method using the pairwise fused penalty, motivated by the analysis of MovieLens data. In particular, the proposed clustering method performs missing-value imputation and at the same time uses movie evaluations and genre weights for each movie to cluster the genre preference information of each individual. We solve the proposed optimization problem using the alternating direction method of multipliers algorithm. Simulations and an application to MovieLens data show that the proposed clustering method is less sensitive to noise and outliers than the existing method.