Predicting the Number of Movie Audiences Through Variable Selection Based on Information Gain Measure

정보 소득율 기반의 변수 선택을 통한 영화 관객 수 예측

  • Received : 2019.06.06
  • Accepted : 2019.06.21
  • Published : 2019.06.30

Abstract

In this study, we propose a methodology for predicting movie audience numbers from information that can be easily acquired before a movie opens, and for effectively discriminating among qualitative variables. In addition, we construct a model that estimates the number of movie audiences at the time of data acquisition from the configured variables. Another purpose of this study is to provide a criterion for categorizing the success of movies by their qualitative characteristics. As the evaluation criterion, we use the information gain ratio, which is the node selection criterion of the C4.5 algorithm. Through this procedure, we selected 416 movie data features. In the multiple linear regression results, the regression model that used variable selection based on the information gain ratio showed excellent performance.
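The information gain ratio named above as the evaluation criterion can be illustrated with a minimal sketch. It assumes a categorical feature and a categorical success label. The function names and the toy genre/outcome data are hypothetical and are not taken from the paper, and the sketch follows the standard C4.5 definition (information gain divided by split information) rather than the authors' actual implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(feature, target):
    """Information gain ratio of a categorical feature with respect to a target."""
    total = len(target)
    # Group the target labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature, target):
        groups.setdefault(value, []).append(label)
    # Information gain: reduction in target entropy after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(target) - conditional
    # Split information: entropy of the feature's own value distribution.
    split_info = entropy(feature)
    # A constant feature carries no information; avoid dividing by zero.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data (not from the paper): one qualitative variable per movie.
genre = ["action", "action", "drama", "drama"]
rating = ["all", "adult", "all", "adult"]
outcome = ["hit", "hit", "flop", "flop"]

# Rank candidate variables by gain ratio, highest first.
scores = {"genre": gain_ratio(genre, outcome),
          "rating": gain_ratio(rating, outcome)}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy case `genre` separates the outcomes perfectly while `rating` does not separate them at all, so `genre` ranks first. A variable selection step of this kind would keep the highest-ranked variables as inputs to the regression model.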
