• Title/Summary/Keyword: movie audience prediction

Search Result 9, Processing Time 0.024 seconds

Predicting movie audience with stacked generalization by combining machine learning algorithms

  • Park, Junghoon;Lim, Changwon
    • Communications for Statistical Applications and Methods
    • /
    • v.28 no.3
    • /
    • pp.217-232
    • /
    • 2021
  • The Korea film industry has matured and the number of movie-watching per capita has reached the highest level in the world. Since then, movie industry growth rate is decreasing and even the total sales of movies per year slightly decreased in 2018. The number of moviegoers is the first factor of sales in movie industry and also an important factor influencing additional sales. Thus it is important to predict the number of movie audiences. In this study, we predict the cumulative number of audiences of films using stacking, an ensemble method. Stacking is a kind of ensemble method that combines all the algorithms used in the prediction. We use box office data from Korea Film Council and web comment data from Daum Movie (www.movie.daum.net). This paper describes the process of collecting and preprocessing of explanatory variables and explains regression models used in stacking. Final stacking model outperforms in the prediction of test set in terms of RMSE.

Movie Box-office Analysis using Social Big Data (소셜 빅데이터를 이용한 영화 흥행 요인 분석)

  • Lee, O-Joun;Park, Seung-Bo;Chung, Daul;You, Eun-Soon
    • The Journal of the Korea Contents Association
    • /
    • v.14 no.10
    • /
    • pp.527-538
    • /
    • 2014
  • The demand prediction is a critical issue for the film industry. As the social media, such as Twitter and Facebook, gains momentum of late, considerable efforts are being dedicated to prediction and analysis of hit movies based on unstructured text data. For prediction of trends found in commercially successful films, the correlations between the amount of data and hit movies may be analyzed by estimating the data variation by period while opinion mining that assigns sentiment polarity score to data may be employed. However, it is not possible to understand why the audience chooses a certain movie or which attribute of a movie is preferred by using such a quantitative approach. This has limited the efforts to identify factors driving a movie's commercial success. In this regard, this study aims to investigate a movie's attributes that reflect the interests of the audience. This would be done by extracting topic keywords that represent the contents of Twits through frequency measurement based on the collected Twitter data while analyzing responses displayed by the audience. The objective is to propose factors driving a movie's commercial success.

CNN Architecture Predicting Movie Rating from Audience's Reviews Written in Korean (한국어 관객 평가기반 영화 평점 예측 CNN 구조)

  • Kim, Hyungchan;Oh, Heung-Seon;Kim, Duksu
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.9 no.1
    • /
    • pp.17-24
    • /
    • 2020
  • In this paper, we present a movie rating prediction architecture based on a convolutional neural network (CNN). Our prediction architecture extends TextCNN, a popular CNN-based architecture for sentence classification, in three aspects. First, character embeddings are utilized to cover many variants of words since reviews are short and not well-written linguistically. Second, the attention mechanism (i.e., squeeze-and-excitation) is adopted to focus on important features. Third, a scoring function is proposed to convert the output of an activation function to a review score in a certain range (1-10). We evaluated our prediction architecture on a movie review dataset and achieved a low MSE (e.g., 3.3841) compared with an existing method. It showed the superiority of our movie rating prediction architecture.

Prediction of movie audience numbers using hybrid model combining GLS and Bass models (GLS와 Bass 모형을 결합한 하이브리드 모형을 이용한 영화 관객 수 예측)

  • Kim, Bokyung;Lim, Changwon
    • The Korean Journal of Applied Statistics
    • /
    • v.31 no.4
    • /
    • pp.447-461
    • /
    • 2018
  • Domestic film industry sales are increasing every year. Theaters are the primary sales channels for movies and the number of audiences using the theater affects additional selling rights. Therefore, the number of audiences using the theater is an important factor directly linked to movie industry sales. In this paper we consider a hybrid model that combines a multiple linear regression model and the Bass model to predict the audience numbers for a specific day. By combining the two models, the predictive value of the regression analysis was corrected to that of the Bass model. In the analysis, three films with different release dates were used. All subset regression method is used to generate all possible combinations and 5-fold cross validation to estimate the model 5 times. In this case, the predicted value is obtained from the model with the smallest root mean square error and then combined with the predicted value of the Bass model to obtain the final predicted value. With the existence of past data, it was confirmed that the weight of the Bass model increases and the compensation is added to the predicted value.

Emotion Prediction System using Movie Script and Cinematography (영화 시나리오와 영화촬영기법을 이용한 감정 예측 시스템)

  • Kim, Jinsu
    • Journal of the Korea Convergence Society
    • /
    • v.9 no.12
    • /
    • pp.33-38
    • /
    • 2018
  • Recently, we are trying to predict the emotion from various information and to convey the emotion information that the supervisor wants to inform the audience. In addition, audiences intend to understand the flow of emotions through various information of non-dialogue parts, such as cinematography, scene background, background sound and so on. In this paper, we propose to extract emotions by mixing not only the context of scripts but also the cinematography information such as color, background sound, composition, arrangement and so on. In other words, we propose an emotional prediction system that learns and distinguishes various emotional expression techniques into dialogue and non-dialogue regions, contributes to the completeness of the movie, and quickly applies them to new changes. The precision of the proposed system is improved by about 5.1% and 0.4%, and the recall is improved by about 4.3% and 1.6%, respectively, when compared with the modified n-gram and morphological analysis.

Analysis of the Involving Mechanism of Kim Eun-Sook Drama : Focused on the Audience's Predictability and the Activities of Constructing Hypotheses (김은숙 드라마 <도깨비>의 몰입기제 구축과정 분석 - 관람자 예측성과 가설 구성 활동을 중심으로 -)

  • Kim, Eui-Jun
    • Journal of Korea Entertainment Industry Association
    • /
    • v.13 no.2
    • /
    • pp.79-91
    • /
    • 2019
  • In the entertainment industry, risk management is crucial for securing competitiveness due to the risk of investment. The competitiveness of contents is reinforced when external factors such as industrial environment and internal factors centering on involving mechanism are simultaneously provided. The involving mechanism is a form of cognitive response behavior of the audience and occurs through signal processing of the brain when watching the image contents. The signal processing of the brain related to the contents watching is mainly performed in the working memory area, and in the case of the captivating movie, the information other than the contents transmitted to the audience is blocked to generate a temporary dissociation state. A dissociation state similar to a symptom such as hypnosis or amnesia occurs when the audience's level of involving is high. On the other hand, contents information in which the audience is concentrating his attention is used intensively for constructing future thinking through an episodic buffer while the inflow of external information is relatively blocked or delayed. The spectator's future thinking configuration takes the form of a hypothesis-forming activity and is based on the predictability of the brain. When these hypothesized behaviors correspond to the problem solving simulation of story and predictability which is an evolutionary function of the brain, the audience' s brain is involved in the contents at a high level. In order for the act to be effective, the factors such as the background of the hypothesis, the subject of the hypothesis, the internal information of the person, the type and position and quantity of the hypothesis information, and the hypothesis relevance and type of information are important. Based on these factors, analysis of the Kim Eun Sook Drama 'Goblin' shows that the above elements are operated in a very organic and meaningful way.

Prediction of Number of Movie Audience Using Feature Minimization and Data Selection (특징 최소화와 데이터 선별을 활용한 영화 관객수 예측)

  • Yang, Youngbo;Yu, Heonchang
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2019.05a
    • /
    • pp.443-446
    • /
    • 2019
  • 빅데이터 분석을 위해 많이 사용하고 있는 기계학습 알고리즘들 중 딥러닝 알고리즘이 많이 활용되고 있으며 분류와 예측에 높은 정확도를 나타내고 있다. 딥러닝 알고리즘의 적용에 따른 많은 장단점들이 있지만, 단점은 분석에 사용되는 특징들이 너무 많다는 것과 분석 모델을 만드는데 사용되는 알고리즘도 여러 가지를 적용하다 보니 분석 시간이 오래 걸린다는 것이다. 이런 단점들은 업무를 파악하면 특징을 최소화할 수 있고 필요로 하는 정보만 선별해서 대표적인 딥러닝 알고리즘 하나에 분석을 하게 되면 분석 시간을 단축시킬 수 있다. 이 실험은 [1], [2]에서 연구한 영화 관객수 예측 모델을 4개의 특징으로 최소화하고 선별된 데이터를 인공신경망 알고리즘 하나로 예측 모델을 생성하였을 때 유의미한 정보를 도출해 낼 수 있는지를 알아보기 위한 것이다. 실험결과는 최종 관객수를 1명 단위까지 정확하게 예측하지는 못했지만 비슷한 수준의 관객수 정보를 예측하였다. 학문적인 접근으로 보았을 때 예측 정확도가 높지 않으면 사용이 불가능한 모델이라고 판단할 수 있지만, 기업 입장으로 접근해 보았을 때 예측 정보가 [1]. [2] 연구 결과에 비해 부족한 수준은 아니다. 총 소요된 시간은 기획 3일, 데이터 수집 및 모델 개발 5일, 분석 시간 10분으로 개발 시간 단축, 업무 효율성 향상, 비용 절감을 기대할 수 있다.

The Audience Behavior-based Emotion Prediction Model for Personalized Service (고객 맞춤형 서비스를 위한 관객 행동 기반 감정예측모형)

  • Ryoo, Eun Chung;Ahn, Hyunchul;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.2
    • /
    • pp.73-85
    • /
    • 2013
  • Nowadays, in today's information society, the importance of the knowledge service using the information to creative value is getting higher day by day. In addition, depending on the development of IT technology, it is ease to collect and use information. Also, many companies actively use customer information to marketing in a variety of industries. Into the 21st century, companies have been actively using the culture arts to manage corporate image and marketing closely linked to their commercial interests. But, it is difficult that companies attract or maintain consumer's interest through their technology. For that reason, it is trend to perform cultural activities for tool of differentiation over many firms. Many firms used the customer's experience to new marketing strategy in order to effectively respond to competitive market. Accordingly, it is emerging rapidly that the necessity of personalized service to provide a new experience for people based on the personal profile information that contains the characteristics of the individual. Like this, personalized service using customer's individual profile information such as language, symbols, behavior, and emotions is very important today. Through this, we will be able to judge interaction between people and content and to maximize customer's experience and satisfaction. There are various relative works provide customer-centered service. Specially, emotion recognition research is emerging recently. Existing researches experienced emotion recognition using mostly bio-signal. Most of researches are voice and face studies that have great emotional changes. However, there are several difficulties to predict people's emotion caused by limitation of equipment and service environments. So, in this paper, we develop emotion prediction model based on vision-based interface to overcome existing limitations. Emotion recognition research based on people's gesture and posture has been processed by several researchers. This paper developed a model that recognizes people's emotional states through body gesture and posture using difference image method. And we found optimization validation model for four kinds of emotions' prediction. A proposed model purposed to automatically determine and predict 4 human emotions (Sadness, Surprise, Joy, and Disgust). To build up the model, event booth was installed in the KOCCA's lobby and we provided some proper stimulative movie to collect their body gesture and posture as the change of emotions. And then, we extracted body movements using difference image method. And we revised people data to build proposed model through neural network. The proposed model for emotion prediction used 3 type time-frame sets (20 frames, 30 frames, and 40 frames). And then, we adopted the model which has best performance compared with other models.' Before build three kinds of models, the entire 97 data set were divided into three data sets of learning, test, and validation set. The proposed model for emotion prediction was constructed using artificial neural network. In this paper, we used the back-propagation algorithm as a learning method, and set learning rate to 10%, momentum rate to 10%. The sigmoid function was used as the transform function. And we designed a three-layer perceptron neural network with one hidden layer and four output nodes. Based on the test data set, the learning for this research model was stopped when it reaches 50000 after reaching the minimum error in order to explore the point of learning. We finally processed each model's accuracy and found best model to predict each emotions. The result showed prediction accuracy 100% from sadness, and 96% from joy prediction in 20 frames set model. And 88% from surprise, and 98% from disgust in 30 frames set model. The findings of our research are expected to be useful to provide effective algorithm for personalized service in various industries such as advertisement, exhibition, performance, etc.

Development of New Variables Affecting Movie Success and Prediction of Weekly Box Office Using Them Based on Machine Learning (영화 흥행에 영향을 미치는 새로운 변수 개발과 이를 이용한 머신러닝 기반의 주간 박스오피스 예측)

  • Song, Junga;Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.67-83
    • /
    • 2018
  • The Korean film industry with significant increase every year exceeded the number of cumulative audiences of 200 million people in 2013 finally. However, starting from 2015 the Korean film industry entered a period of low growth and experienced a negative growth after all in 2016. To overcome such difficulty, stakeholders like production company, distribution company, multiplex have attempted to maximize the market returns using strategies of predicting change of market and of responding to such market change immediately. Since a film is classified as one of experiential products, it is not easy to predict a box office record and the initial number of audiences before the film is released. And also, the number of audiences fluctuates with a variety of factors after the film is released. So, the production company and distribution company try to be guaranteed the number of screens at the opining time of a newly released by multiplex chains. However, the multiplex chains tend to open the screening schedule during only a week and then determine the number of screening of the forthcoming week based on the box office record and the evaluation of audiences. Many previous researches have conducted to deal with the prediction of box office records of films. In the early stage, the researches attempted to identify factors affecting the box office record. And nowadays, many studies have tried to apply various analytic techniques to the factors identified previously in order to improve the accuracy of prediction and to explain the effect of each factor instead of identifying new factors affecting the box office record. However, most of previous researches have limitations in that they used the total number of audiences from the opening to the end as a target variable, and this makes it difficult to predict and respond to the demand of market which changes dynamically. Therefore, the purpose of this study is to predict the weekly number of audiences of a newly released film so that the stakeholder can flexibly and elastically respond to the change of the number of audiences in the film. To that end, we considered the factors used in the previous studies affecting box office and developed new factors not used in previous studies such as the order of opening of movies, dynamics of sales. Along with the comprehensive factors, we used the machine learning method such as Random Forest, Multi Layer Perception, Support Vector Machine, and Naive Bays, to predict the number of cumulative visitors from the first week after a film release to the third week. At the point of the first and the second week, we predicted the cumulative number of visitors of the forthcoming week for a released film. And at the point of the third week, we predict the total number of visitors of the film. In addition, we predicted the total number of cumulative visitors also at the point of the both first week and second week using the same factors. As a result, we found the accuracy of predicting the number of visitors at the forthcoming week was higher than that of predicting the total number of them in all of three weeks, and also the accuracy of the Random Forest was the highest among the machine learning methods we used. This study has implications in that this study 1) considered various factors comprehensively which affect the box office record and merely addressed by other previous researches such as the weekly rating of audiences after release, the weekly rank of the film after release, and the weekly sales share after release, and 2) tried to predict and respond to the demand of market which changes dynamically by suggesting models which predicts the weekly number of audiences of newly released films so that the stakeholders can flexibly and elastically respond to the change of the number of audiences in the film.