Video-to-Video Generated by Collage Technique

  • Cho, Hyeongrae (Dept. of Media IT Engineering, The Graduate School, Seoul National University of Science and Technology) ;
  • Park, Gooman (Dept. of Electronics IT Media Engineering, Seoul National University of Science and Technology)
  • Received : 2020.11.30
  • Accepted : 2021.01.18
  • Published : 2021.01.30

Abstract

In deep learning, research on generation has produced many algorithms, most of them since the introduction of the generative adversarial network (GAN), yet generation in this sense has both similarities to and differences from generation in art. Generation in engineering is judged mainly by quantitative indicators or by whether an answer is correct or incorrect, whereas creation in art cross-checks and doubts correct and incorrect answers from various perspectives to produce work that interprets the world and human life. In this paper, we interpret the video generation ability of deep learning from the perspective of collage and compare it with results made by an artist. The experiment compares and analyzes how well a GAN reproduces the artist's collage work and where it differs from the creative part of that work; we also define performance evaluation items for the GAN's reproduction ability and survey satisfaction with them. To test how well the artist's statement and expressive purpose were reproduced, we selected deep learning algorithms corresponding to keywords in the statement and compared their similarity. In the experiment, the GAN fell well short of expectations in expressing the collage technique. Nevertheless, in image association it received higher satisfaction ratings than the human work, a positive finding suggesting that GANs can show ability comparable to humans in abstract creation.
