Prediction of movie audience numbers using hybrid model combining GLS and Bass models

  • Kim, Bokyung (Department of Applied Statistics, Chung-Ang University) ;
  • Lim, Changwon (Department of Applied Statistics, Chung-Ang University)
  • Received : 2018.03.05
  • Accepted : 2018.06.20
  • Published : 2018.08.31

Abstract

Sales in the domestic film industry are increasing every year. Theaters are the primary sales channel for movies, and the number of theater admissions affects revenue from ancillary rights. Theater audience numbers are therefore an important factor directly linked to film industry sales. In this paper, we consider a hybrid model that combines a multiple linear regression model and the Bass model to predict the audience number on a specific day. By combining the two models, the predicted value from the regression analysis is corrected using the predicted value from the Bass model. Three films with different release dates are used in the analysis. All-subset regression is used to generate every possible combination of predictors, and each candidate model is estimated five times through 5-fold cross-validation. The model with the smallest root mean square error provides the regression prediction, which is then combined with the Bass model prediction to obtain the final predicted value. We confirmed that as more past data become available, the weight of the Bass model increases and it exerts a correcting effect on the predicted value.
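The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the ordinary least squares fit, the random fold assignment, and in particular the convex weighting `(1 - w) * regression + w * Bass` are assumptions made for illustration, since the abstract does not state how the weight is defined or estimated. The Bass daily audience uses the standard cumulative adoption function F(t) = (1 - e^{-(p+q)t}) / (1 + (q/p) e^{-(p+q)t}) with market potential m.

```python
import itertools
import numpy as np

def bass_cumulative(t, p, q):
    """Standard Bass cumulative adoption fraction F(t); F(0) = 0."""
    e = np.exp(-(p + q) * np.asarray(t, dtype=float))
    return (1.0 - e) / (1.0 + (q / p) * e)

def bass_daily(t, m, p, q):
    """Bass-model audience on day t (t = 1, 2, ...): m * (F(t) - F(t-1))."""
    t = np.asarray(t, dtype=float)
    return m * (bass_cumulative(t, p, q) - bass_cumulative(t - 1.0, p, q))

def cv_rmse(X, y, k=5, seed=0):
    """Mean RMSE over k folds for a least-squares fit with intercept."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    A = np.column_stack([np.ones(n), X])
    errors = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        beta, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        errors.append(np.sqrt(np.mean((A[test] @ beta - y[test]) ** 2)))
    return float(np.mean(errors))

def best_subset(X, y, k=5):
    """All-subset search: return (columns, rmse) with the smallest CV RMSE."""
    best = None
    for r in range(1, X.shape[1] + 1):
        for cols in itertools.combinations(range(X.shape[1]), r):
            score = cv_rmse(X[:, list(cols)], y, k)
            if best is None or score < best[1]:
                best = (cols, score)
    return best

def hybrid(reg_pred, bass_pred, w):
    """Assumed combination rule: convex weighting with Bass weight w in [0, 1]."""
    return (1.0 - w) * np.asarray(reg_pred) + w * np.asarray(bass_pred)
```

In this sketch, a larger `w` pulls the final prediction toward the Bass curve, which mirrors the abstract's observation that the Bass weight grows as more past data exist. The all-subset search evaluates 2^k - 1 models for k predictors, so it is practical only for a modest number of candidate variables.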
