Image Mood Classification Using Deep CNN and Its Application to Automatic Video Generation

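  • Dong-Hee Cho (School of Software, Kwangwoon University) ;
  • Yong-Wook Nam (Department of Computer Science, Kwangwoon University) ;
  • Hyun-Chang Lee (Department of Computer Science, Kwangwoon University) ;
  • Yong-Hyuk Kim (School of Software, Kwangwoon University)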
  • Received : 2019.08.16
  • Accepted : 2019.09.20
  • Published : 2019.09.28

Abstract

In this paper, image mood is classified into eight categories using a deep convolutional neural network, and a video is generated automatically with matching background music. A classification model is trained on the collected image data using a multilayer perceptron (MLP). The trained model performs multi-class classification to predict the mood of the images to be used in a video, and the video is generated by pairing those images with music that has been classified by mood in advance. The model achieved 72.4% accuracy under 10-fold cross-validation and 64% accuracy, computed from the confusion matrix, in experiments on actual images. Misclassified images were generally assigned to a nearby, similar mood, so the music in the resulting video did not noticeably clash with the images.
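
The matching and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the mood labels (`mood_0` to `mood_7`), the `MUSIC_LIBRARY` mapping, the track file names, and all function names are hypothetical placeholders, since the abstract does not name the eight moods or describe the music collection. The sketch assumes the classifier outputs one score per mood and that accuracy is read off the confusion matrix as the diagonal sum divided by the total count.

```python
import random

# Hypothetical placeholders: the abstract does not name the eight moods.
MOODS = [f"mood_{i}" for i in range(8)]

# Hypothetical pre-classified music library: mood label -> track file names.
MUSIC_LIBRARY = {
    mood: [f"{mood}_track_{j}.mp3" for j in range(3)] for mood in MOODS
}


def predict_mood(class_scores):
    """Return the mood with the highest score (multi-class argmax)."""
    if len(class_scores) != len(MOODS):
        raise ValueError("expected one score per mood")
    best = max(range(len(MOODS)), key=lambda i: class_scores[i])
    return MOODS[best]


def pick_music(mood, rng=random):
    """Pick one track from those pre-classified under the given mood."""
    return rng.choice(MUSIC_LIBRARY[mood])


def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Count matrix with rows as true classes and columns as predictions."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy_from_confusion(matrix):
    """Accuracy = correctly classified samples (diagonal) / all samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

For example, passing eight classifier scores to `predict_mood` and the result to `pick_music` yields a track drawn from the predicted mood's pool. An image misclassified into a similar mood would simply draw from a neighboring pool, which is consistent with the abstract's observation that such errors did not cause a noticeable mismatch.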
