• Title/Summary/Keyword: predicting movie success

Search Result 11, Processing Time 0.025 seconds

Predicting the Number of Movie Audiences Through Variable Selection Based on Information Gain Measure (정보 소득율 기반의 변수 선택을 통한 영화 관객 수 예측)

  • Park, Hyeon-Mock;Choi, Sang Hyun
    • Journal of Information Technology Applications and Management
    • /
    • v.26 no.3
    • /
    • pp.19-27
    • /
    • 2019
  • In this study, we propose a methodology for predicting the movie audience based on movie information that can be easily acquired before opening and effectively distinguishing qualitative variables. In addition, we constructed a model to estimate the number of movie audiences at the time of data acquisition through the configured variables. Another purpose of this study is to provide a criterion for categorizing success of movies with qualitative characteristics. As an evaluation criterion, we used information gain ratio which is the node selection criterion of C4.5 algorithm. Through the procedure we have selected 416 movie data features. As a result of the multiple linear regression model, the performance of the regression model using the variables selection method based on the information gain ratio was excellent.

Predicting Movie Success based on Machine Learning Using Twitter (트위터를 이용한 기계학습 기반의 영화흥행 예측)

  • Yim, Junyeob;Hwang, Byung-Yeon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.3 no.7
    • /
    • pp.263-270
    • /
    • 2014
  • This paper suggests a method for predicting a box-office success of the film. Lately, as the growth of the film industry, a variety of studies for the prediction of market demand is being performed. The product life cycle of film is relatively short cultural goods. Therefore, in order to produce stable profits, marketing costs before opening as well as the number of screen after opening need a plan. To fulfill this plan, the demand for the product and the calculation of economic profit scale should be preceded. The cases of existing researches, as a variable for predicting, primarily use the factors of competition of the market or the properties of the film. However, the proportion of the potential audiences who purchase the goods is relatively insufficient. Therefore, in this paper, in order to consider people's perception of a movie, Twitter was utilized as one of the survey samples. The existing variables and the information extracted from Twitter are defined as off-line and on-line element, and applied those two elements in machine learning by combining. Through the experiment, the proposed predictive techniques are validated, and the results of the experiment predicted the chance of successful film with about 95% of accuracy.

A Study for the Development of Motion Picture Box-office Prediction Model (영화 흥행 결정 요인과 흥행 성과 예측 연구)

  • Kim, Yon-Hyong;Hong, Jeong-Han
    • Communications for Statistical Applications and Methods
    • /
    • v.18 no.6
    • /
    • pp.859-869
    • /
    • 2011
  • Interest has increased in academic research regarding key factors that drive box-office success as well as the ability to predict the box-office success of a movie from a commercial perspective. This study analyzed the relationship between key success factors of a movie and box office records based on movies released in 2010 in Korea. At the pre-production investment decision-making stage, the movie genre, motion picture rating, director power, and actor power were statistically significant. At the stage of distribution decision-making process after movie production, among other factors, the influence of star actors, number of screens, power of distributors, and social media turned out to be statistically significant. We verified movie success factors through the application of a Multinomial Logit Model that used the concept of choice probabilities. The Multinomial Logit Model resulted in a higher level of accuracy in predicting box-office success compared to the Artificial Neural Network and Discriminant Analysis.

Predicting Financial Success of a Movie Using Multiple Regression Analysis (다중회귀 분석을 이용한 영화 흥행 예측)

  • Jeong, Hoe-Yun;Yang, Hyung-Jeong
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2013.07a
    • /
    • pp.275-278
    • /
    • 2013
  • 영화의 흥행 요소를 파악하여 영화의 흥행 여부를 예측하는 것은 영화의 수익성 부분에서 아주 중요하다. 영화 시장이 과거와는 다르게 증가함에 따라, 다양한 영화 흥행에 관한 예측 연구들이 개발되었다. 본 논문에서는 영화 흥행 요소들을 수집하고 다중회귀 분석을 통해서 유의수준을 만족하는 흥행 요소들을 선택한다. 그 후, 이러한 요소들을 예측 방법들의 입력값으로 사용하여 영화 흥행을 예측한다. 성능을 비교하기 위해 본 논문에서 제안한 방법과 현재 개발된 영화 흥행 예측 방법(다중회귀, 의사결정트리, 인공신경망)들을 정확도와 평균제곱근오차를 통해 예측 모형의 성능을 비교한다. 그 결과, 다중 회귀 분석을 통해 유의한 흥행요소들만을 고려한 예측 방법의 정확도가 모든 흥행 요소들을 고려한 예측 방법보다 평균 8.2% 향상되었고, 현재까지 개발된 영화 흥행 예측 방법보다 더 높은 예측 성능을 보여준다.

  • PDF

Text Mining and Sentiment Analysis for Predicting Box Office Success

  • Kim, Yoosin;Kang, Mingon;Jeong, Seung Ryul
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.8
    • /
    • pp.4090-4102
    • /
    • 2018
  • After emerging online communications, text mining and sentiment analysis has been frequently applied into analyzing electronic word-of-mouth. This study aims to develop a domain-specific lexicon of sentiment analysis to predict box office success in Korea film market and validate the feasibility of the lexicon. Natural language processing, a machine learning algorithm, and a lexicon-based sentiment classification method are employed. To create a movie domain sentiment lexicon, 233,631 reviews of 147 movies with popularity ratings is collected by a XML crawling package in R program. We accomplished 81.69% accuracy in sentiment classification by the Korean sentiment dictionary including 706 negative words and 617 positive words. The result showed a stronger positive relationship with box office success and consumers' sentiment as well as a significant positive effect in the linear regression for the predicting model. In addition, it reveals emotion in the user-generated content can be a more accurate clue to predict business success.

Big Data Preprocessing for Predicting Box Office Success (영화 흥행 실적 예측을 위한 빅데이터 전처리)

  • Jun, Hee-Gook;Hyun, Geun-Soo;Lim, Kyung-Bin;Lee, Woo-Hyun;Kim, Hyoung-Joo
    • KIISE Transactions on Computing Practices
    • /
    • v.20 no.12
    • /
    • pp.615-622
    • /
    • 2014
  • The Korean film market has rapidly achieved an international scale, and this has led to a need for decision-making based on analytical methods that are more precise and appropriate. In this modern era, a highly advanced information environment can provide an overwhelming amount of data that is generated in real time, and this data must be properly handled and analyzed in order to extract useful information. In particular, the preprocessing of large data, which is the most time-consuming step, should be done in a reasonable amount of time. In this paper, we investigated a big data preprocessing method for predicting movie box office success. We analyzed the movie data characteristics for specialized preprocessing methods, and used the Hadoop MapReduce framework. The experimental results showed that the preprocessing methods using big data techniques are more effective than existing methods.

A Comparison of Predicting Movie Success between Artificial Neural Network and Decision Tree (기계학습 기반의 영화흥행예측 방법 비교: 인공신경망과 의사결정나무를 중심으로)

  • Kwon, Shin-Hye;Park, Kyung-Woo;Chang, Byeng-Hee
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.7 no.4
    • /
    • pp.593-601
    • /
    • 2017
  • In this paper, we constructed the model of production/investment, distribution, and screening by using variables that can be considered at each stage according to the value chain stage of the movie industry. To increase the predictive power of the model, a regression analysis was used to derive meaningful variables. Based on the given variables, we compared the difference in predictive power between the artificial neural network, which is a machine learning analysis method, and the decision tree analysis method. As a result, the accuracy of artificial neural network was higher than that of decision trees when all variables were added in production/ investment model and distribution model. However, decision trees were more accurate when selected variables were applied according to regression analysis results. In the screening model, the accuracy of the artificial neural network was higher than the accuracy of the decision tree regardless of whether the regression analysis result was reflected or not. This paper has an implication which we tried to improve the performance of movie prediction model by using machine learning analysis. In addition, we tried to overcome a limitation of linear approach by reflecting the results of regression analysis to ANN and decision tree model.

A Model of Predictive Movie 10 Million Spectators through Big Data Analysis (빅데이터 분석을 통한 천만 관객 영화 예측 모델)

  • Yu, Jong-Pil;Lee, Eung-hwan
    • The Journal of Bigdata
    • /
    • v.3 no.1
    • /
    • pp.63-71
    • /
    • 2018
  • In the last five years (2013~2017), we analyzed what factors influenced Korean films that have surpassed 10 million viewers in the Korean movie industry, where the total number of moviegoers is over 200 million. In general, many people consider the number of screens and ratings as important factors that affect the audience's success. In this study, four additional factors, including the number of screens and ratings, were established to establish a hypothesis and correlate it with the presence of 10 million spectators through big data analysis. The results were significant, with 91 percent accuracy in predicting 10 million viewers and 99.4 percent accuracy in estimating cumulative attendance.

A Box Office Type Classification and Prediction Model Based on Automated Machine Learning for Maximizing the Commercial Success of the Korean Film Industry (한국 영화의 산업의 흥행 극대화를 위한 AutoML 기반의 박스오피스 유형 분류 및 예측 모델)

  • Subeen Leem;Jihoon Moon;Seungmin Rho
    • Journal of Platform Technology
    • /
    • v.11 no.3
    • /
    • pp.45-55
    • /
    • 2023
  • This paper presents a model that supports decision-makers in the Korean film industry to maximize the success of online movies. To achieve this, we collected historical box office movies and clustered them into types to propose a model predicting each type's online box office performance. We considered various features to identify factors contributing to movie success and reduced feature dimensionality for computational efficiency. We systematically classified the movies into types and predicted each type's online box office performance while analyzing the contributing factors. We used automated machine learning (AutoML) techniques to automatically propose and select machine learning algorithms optimized for the problem, allowing for easy experimentation and selection of multiple algorithms. This approach is expected to provide a foundation for informed decision-making and contribute to better performance in the film industry.

  • PDF

Development of New Variables Affecting Movie Success and Prediction of Weekly Box Office Using Them Based on Machine Learning (영화 흥행에 영향을 미치는 새로운 변수 개발과 이를 이용한 머신러닝 기반의 주간 박스오피스 예측)

  • Song, Junga;Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.67-83
    • /
    • 2018
  • The Korean film industry with significant increase every year exceeded the number of cumulative audiences of 200 million people in 2013 finally. However, starting from 2015 the Korean film industry entered a period of low growth and experienced a negative growth after all in 2016. To overcome such difficulty, stakeholders like production company, distribution company, multiplex have attempted to maximize the market returns using strategies of predicting change of market and of responding to such market change immediately. Since a film is classified as one of experiential products, it is not easy to predict a box office record and the initial number of audiences before the film is released. And also, the number of audiences fluctuates with a variety of factors after the film is released. So, the production company and distribution company try to be guaranteed the number of screens at the opining time of a newly released by multiplex chains. However, the multiplex chains tend to open the screening schedule during only a week and then determine the number of screening of the forthcoming week based on the box office record and the evaluation of audiences. Many previous researches have conducted to deal with the prediction of box office records of films. In the early stage, the researches attempted to identify factors affecting the box office record. And nowadays, many studies have tried to apply various analytic techniques to the factors identified previously in order to improve the accuracy of prediction and to explain the effect of each factor instead of identifying new factors affecting the box office record. However, most of previous researches have limitations in that they used the total number of audiences from the opening to the end as a target variable, and this makes it difficult to predict and respond to the demand of market which changes dynamically. Therefore, the purpose of this study is to predict the weekly number of audiences of a newly released film so that the stakeholder can flexibly and elastically respond to the change of the number of audiences in the film. To that end, we considered the factors used in the previous studies affecting box office and developed new factors not used in previous studies such as the order of opening of movies, dynamics of sales. Along with the comprehensive factors, we used the machine learning method such as Random Forest, Multi Layer Perception, Support Vector Machine, and Naive Bays, to predict the number of cumulative visitors from the first week after a film release to the third week. At the point of the first and the second week, we predicted the cumulative number of visitors of the forthcoming week for a released film. And at the point of the third week, we predict the total number of visitors of the film. In addition, we predicted the total number of cumulative visitors also at the point of the both first week and second week using the same factors. As a result, we found the accuracy of predicting the number of visitors at the forthcoming week was higher than that of predicting the total number of them in all of three weeks, and also the accuracy of the Random Forest was the highest among the machine learning methods we used. This study has implications in that this study 1) considered various factors comprehensively which affect the box office record and merely addressed by other previous researches such as the weekly rating of audiences after release, the weekly rank of the film after release, and the weekly sales share after release, and 2) tried to predict and respond to the demand of market which changes dynamically by suggesting models which predicts the weekly number of audiences of newly released films so that the stakeholders can flexibly and elastically respond to the change of the number of audiences in the film.