Analysis and Design of Arts and Culture Content Creation Tool powered by Artificial Intelligence

  • Shin, Choonsung (Program of Media Content and Culture Technology, Graduate School of Culture, Chonnam National University)
  • Jeong, Hieyong (Department of Artificial Intelligence Convergence, Chonnam National University)
  • Received : 2021.07.01
  • Accepted : 2021.07.28
  • Published : 2021.09.30

Abstract

This paper proposes an arts and culture content creation tool powered by artificial intelligence. With recent advances in technologies such as artificial intelligence, research on creating arts and culture content has become active. However, these technologies remain difficult and cumbersome for creators who are unfamiliar with programming and artificial intelligence. To address content creation with new technologies, we analyze related creation tools, services, and technologies that process raw visual and audio data, generate new media content, and visualize intermediate results. We then extract key requirements for a future creation tool aimed at creators without programming or artificial intelligence expertise. Finally, we introduce an intuitive, integrated content creation tool for end users. We hope this tool will allow creators to generate new media arts and culture content intuitively and creatively, based not only on understanding the given data but also on adopting new technologies.
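The pattern sketched in the abstract, extracting features from raw visual or audio data and visualizing the intermediate results before any content is generated, can be made concrete with a short example. The paper itself provides no code, so the following is only a minimal, hypothetical Python sketch: it uses scikit-learn's t-SNE on synthetic stand-in vectors where a real tool would use image or audio embeddings.

```python
# A minimal sketch (not from the paper) of the surveyed pattern:
# extract intermediate features from raw data, then visualize them
# so a creator can inspect the data before generating new content.
# Assumes Python with NumPy and scikit-learn; the data is synthetic.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Stand-in for feature vectors extracted from images or audio clips
# (e.g., CNN embeddings): 3 clusters of 50 samples in 128 dimensions.
features = np.vstack([
    rng.normal(loc=c, scale=0.5, size=(50, 128)) for c in (-2.0, 0.0, 2.0)
])

# Project the high-dimensional features to 2D for visual inspection,
# the "visualize intermediate results" step the abstract mentions.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)
print(embedding.shape)  # (150, 2): one 2D point per input sample
```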

This paper surveys artificial intelligence based arts and culture content creation technologies, which offer new methods and diverse possibilities for content creation, and on this basis proposes an intuitive creation tool for the general public. Although a variety of AI-based creation technologies have recently been proposed, most are developed and offered as services for fixed purposes, which greatly limits their extensibility and usability for creation and convergence. We review technology trends in AI-based data analysis and processing, content generation and creation, and visualization, and then propose an intuitive creation tool for non-expert creators. The proposed tool reflects the characteristics of users, creation environments, and artificial intelligence, and consists of components that generate new content by processing and transforming arts and culture data and applying AI models throughout the creation process. Such an AI-powered creation tool effectively processes and structures large and diverse arts and culture data and applies various generation and creation models, reducing the time required for creation while supporting experimentation with new ideas. We expect the proposed tool to provide a foundation on which creators can handle artificial intelligence and related technologies with ease, understand given data from multiple perspectives, and express new ideas and creativity.
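As a rough illustration of the component structure described above, data processing and transformation stages composed with AI models to produce new content while keeping intermediate results available for inspection, here is a minimal, hypothetical Python sketch. The class and stage names are our own for illustration and do not come from the paper.

```python
# A minimal sketch (all names hypothetical, not the authors' API) of a
# creation pipeline: process and transform arts-and-culture data, apply
# an AI model, and emit new content, retaining each intermediate result.
from typing import Any, Callable, List, Tuple

Stage = Callable[[Any], Any]

class CreationPipeline:
    """Chains data-processing, model, and generation stages."""

    def __init__(self, stages: List[Tuple[str, Stage]]):
        self.stages = stages

    def run(self, data: Any) -> Tuple[Any, List[Tuple[str, Any]]]:
        intermediates = []  # kept so each step can be visualized later
        for name, stage in self.stages:
            data = stage(data)
            intermediates.append((name, data))
        return data, intermediates

# Hypothetical usage: normalize input data, then apply a model stub.
pipeline = CreationPipeline([
    ("preprocess", lambda d: d.strip().lower()),
    ("generate",   lambda d: f"generated content from: {d}"),  # model stub
])
result, steps = pipeline.run("  Folk Painting Dataset ")
print(result)
```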

Acknowledgement

This work was financially supported by Chonnam National University (Grant No. 2020-2020).
