Super High-Resolution Image Style Transfer

  • Kim, Yong-Goo (Dept. of AI Software Eng., Seoul Media Institute of Technology)
  • Received : 2021.10.25
  • Accepted : 2021.12.12
  • Published : 2022.01.30

Abstract

Neural-network-based style transfer produces very high-quality results by reflecting the high-level structural characteristics of images, and has therefore attracted great attention recently. This paper addresses the resolution limit that GPU memory imposes on such neural style transfer. Because the network's receptive field has a fixed size, the style-transfer gradient computed on a partial image can be expected to match the gradient computed on the entire image. Based on this idea, this paper analyzes each component of the style-transfer loss function to derive the conditions that partitioning and padding must satisfy, and to identify which of the data required for the gradient computation depend on the entire input. By structuring this information and supplying it as an auxiliary constant input to the partition-based gradient computation, this paper develops a recursive algorithm for super high-resolution image style transfer. Since the proposed method partitions the input image into pieces of a size that a GPU can handle, it performs style transfer without the input-resolution limit otherwise imposed by GPU memory. With such super high-resolution support, the proposed method can render distinctive stylistic details in local regions that can only be appreciated in super high-resolution style transfer.

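The partition-based gradient idea described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: `grad_fn` is a placeholder for the style-transfer loss gradient, the `tile` and `halo` names are illustrative, and the global, whole-image-dependent statistics (e.g. Gram-matrix terms) that the paper structures as auxiliary constant inputs are assumed to be folded into `grad_fn`. The key point is that each tile is padded by the receptive-field radius so that its interior gradient matches the full-image computation.

```python
import numpy as np

def tiled_gradient(image, grad_fn, tile=256, halo=32):
    """Accumulate a full-resolution gradient from overlapping tiles.

    Each tile is padded by `halo` pixels (the receptive-field radius) so
    that, for a network with a fixed receptive field, the interior
    gradient of a tile equals the gradient computed on the full image.
    `grad_fn` maps an HxWxC array to a gradient of the same shape.
    """
    H, W, C = image.shape
    grad = np.zeros_like(image, dtype=np.float64)
    for y in range(0, H, tile):
        for x in range(0, W, tile):
            # Padded tile bounds, clipped to the image borders.
            y0, y1 = max(0, y - halo), min(H, y + tile + halo)
            x0, x1 = max(0, x - halo), min(W, x + tile + halo)
            g = grad_fn(image[y0:y1, x0:x1])
            # Keep only the interior (non-halo) part of the tile gradient.
            iy0, ix0 = y - y0, x - x0
            iy1 = iy0 + min(tile, H - y)
            ix1 = ix0 + min(tile, W - x)
            grad[y:y + tile, x:x + tile] = g[iy0:iy1, ix0:ix1]
    return grad
```

For a purely local `grad_fn` the tiled result is identical to the full-image result, which is the property the paper's analysis establishes for the receptive-field-limited parts of the loss; only a GPU-sized tile plus its halo needs to reside in memory at a time.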

Acknowledgement

This research was supported by the Ministry of Culture, Sports and Tourism and the Korea Creative Content Agency (Project Number: R2020040238).

References

  1. P. Rosin and J. Collomosse, "Image and video-based artistic stylization", Springer-Verlag, London, 2013.
  2. T. Strothotte and S. Schlechtweg, "Non-photorealistic computer graphics: modeling, rendering, and animation", Elsevier Science, USA, 2002.
  3. D. Heeger and J. Bergen, "Pyramid-based texture analysis/synthesis," Proc. of the 22nd annual conf. on computer graphics and interactive techniques, pp.229-238, 1995. doi:10.1145/218380.218446.
  4. A. Efros and T. Leung, "Texture synthesis by non-parametric sampling," Proc. of the IEEE Int. Conf. on Computer Vision(ICCV), pp.1033-1038, 1999. doi:10.1109/ICCV.1999.790383.
  5. L. A. Gatys, A. S. Ecker, and M. Bethge, "Image style transfer using convolutional neural networks," Proc. of the IEEE Int. Conf. on Computer Vis. and Patt. Recog.(CVPR), pp.2414-2423, 2016. doi:10.1109/CVPR.2016.265.
  6. Y. Li, N. Wang, J. Liu, and X. Hou, "Demystifying neural style transfer," Proc. of the 26th Int. Joint Conf. on Artificial Intelligence(IJCAI), pp. 2230-2236, 2017. doi:10.5555/3172077.3172198.
  7. E. Risser, P. Wilmot, and C. Barnes, "Stable and controllable neural texture synthesis and style transfer using histogram losses," arXiv preprint arXiv:1701.08893, 2018.
  8. Choi and Y.-G. Kim, "A normalized loss function of style transfer network for more diverse and more stable transfer results," J. of Broadcast Engineering, vol.25, no.6, pp.980-993, 2020. doi:10.5909/JBE.2020.25.6.980.
  9. H. Huang, H. Wang, W. Luo, L. Ma, W. Jiang, X. Zhu, Z. Li, and W. Liu, "Real-time neural style transfer for videos," Proc. of the IEEE Conf. on Computer Vis. and Patt. Recog.(CVPR), pp. 7044-7052, 2017. doi: 10.1109/CVPR.2017.745.
  10. M. Ruder, A. Dosovitskiy, and T. Brox, "Artistic style transfer for videos and spherical images," Int. J. of Computer Vision, vol.126, no.11, pp.1199-1219, 2018. doi:10.1007/s11263-018-1089-z.
  11. J. Johnson, A. Alahi, and L. Fei-Fei, "Perceptual losses for real-time style transfer and super-resolution," Proc. of the European Conf. on Computer Vision(ECCV), pp.694-711, 2016. doi:10.1007/978-3-319-46475-6_43.
  12. A. Sanakoyeu, D. Kotovenko, S. Lang, and B. Ommer, "A style-aware content loss for real-time HD style transfer," Proc. of the European Conf. on Computer Vision(ECCV), pp.698-714, 2018. doi:10.1007/978-3-030-01237-3_43.
  13. V. Dumoulin, J. Shlens, and M. Kudlur, "A learned representation for artistic style," Proc. of the Int. Conf. on Learning Representations, arXiv preprint arXiv:1610.07629, 2016.
  14. H. Zhang and K. Dana, "Multi-style generative network for real-time transfer," arXiv preprint arXiv:1703.06953, 2017.
  15. X. Huang and S. Belongie, "Arbitrary style transfer in real-time with adaptive instance normalization," Proc. of the IEEE Int. Conf. on Computer Vision(ICCV), pp.1510-1519, 2017. doi:10.1109/ICCV.2017.167.
  16. Y. Li, C. Fang, J. Yang, Z. Wang, X. Lu, and M. Yang, "Universal style transfer via feature transforms," arXiv preprint arXiv:1705.08086, 2017.
  17. L. Sheng, Z. Lin, J. Shao, and X. Wang, "Avatar-net: multi-scale zero-shot style transfer by feature decoration," arXiv preprint arXiv:1805.03857, 2018.
  18. Y. Jing, Y. Yang, Z. Feng, J. Ye, Y. Yu, and M. Song, "Neural Style Transfer: A Review," IEEE Trans. on Visualization and Computer Graphics, vol.26, no.11, pp.3365-3385, 2020. doi:10.1109/TVCG.2019.2921336.
  19. L. Gatys, A. Ecker, M. Bethge, A. Hertzman, and E. Shechtman, "Controlling perceptual factors in neural style transfer," arXiv preprint arXiv: 1611.07865, 2016.
  20. K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556v6, 2015.
  21. H. Wang, Y. Li, H. Ju, and M.-H. Yang, "Collaborative distillation for ultra-resolution universal style transfer," arXiv preprint arXiv:2003.08436, 2020.
  22. R. Hecht-Nielsen, "Theory of the backpropagation neural network," in Proc. of the Int. Joint Conf. on Neural Networks(IJCNN), pp. 593-605, 1989. doi: 10.1109/IJCNN.1989.118638.
  23. D. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980v9, 2014.
  24. M. Abadi, et al., "Tensorflow: A system for large-scale machine learning," arXiv preprint arXiv:1605.08695, 2016.
  25. Fast photo style, https://github.com/NVIDIA/FastPhotoStyle (accessed 13 Dec. 2021)
  26. Neural style transfer using tf.keras, https://www.tensorflow.org/tutorials/generative/style_transfer (accessed 13 Dec. 2021)