Deep Learning-based Single Image Generative Adversarial Network: Performance Comparison and Trends

(Korean title, translated: A Comparative Analysis of Deep Learning-Based Single-Image Generative Adversarial Network Methods)

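  • 정성훈 (Department of Journalism and Broadcasting, Pukyong National University)
  • 공경보 (Human ICT Convergence Major, Division of Media Communication, Pukyong National University)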
  • Received : 2022.04.18
  • Accepted : 2022.05.12
  • Published : 2022.05.30

Abstract

Generative adversarial networks (GANs) have demonstrated remarkable success in image synthesis. However, because GAN training tends to be unstable on large datasets, GANs are difficult to apply across application fields. A single-image GAN generates diverse images by learning the internal distribution of a single image. Because it trains on one image rather than on a large dataset, training is stable, and the approach can be applied to tasks such as image retargeting, image manipulation, and super-resolution. In this paper, we survey five single-image GANs: SinGAN, ConSinGAN, InGAN, DeepSIM, and One-Shot GAN. We compare the performance of each model and analyze the strengths and weaknesses of single-image GANs.
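
To make the idea of an "internal distribution" concrete, the sketch below treats a single image as a collection of overlapping patches at several scales, which is one common way to read the term. It is only an illustration under stated assumptions: the function names (`downsample_2x`, `extract_patches`, `patch_pyramid`), the 3x3 patch size, the 2x box downsampling, and the toy 16x16 gradient image are all hypothetical choices and are not taken from the paper or from any of the five surveyed models. No GAN is trained here; the code only shows how much patch-level training data a single image can supply.

```python
from typing import List

Image = List[List[float]]


def downsample_2x(img: Image) -> Image:
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def extract_patches(img: Image, size: int) -> List[Image]:
    """Return every overlapping size x size patch (stride 1)."""
    h, w = len(img), len(img[0])
    return [
        [row[c:c + size] for row in img[r:r + size]]
        for r in range(h - size + 1)
        for c in range(w - size + 1)
    ]


def patch_pyramid(img: Image, size: int, num_scales: int) -> List[List[Image]]:
    """Collect patches at successively coarser scales of one image."""
    levels: List[List[Image]] = []
    current = img
    for _ in range(num_scales):
        if len(current) < size or len(current[0]) < size:
            break  # the image is too small to hold even one patch
        levels.append(extract_patches(current, size))
        current = downsample_2x(current)
    return levels


# Toy 16x16 "image": a simple gradient standing in for real pixel data
# (an assumption made purely for illustration).
image = [[float(r + c) for c in range(16)] for r in range(16)]
levels = patch_pyramid(image, size=3, num_scales=3)
counts = [len(level) for level in levels]
print(counts)  # prints [196, 36, 4]
```

Even this tiny image yields 236 patches across three scales, which gives some intuition for why a model can learn from one image alone. A practical implementation would more likely use a tensor library and learned, convolutional patch discriminators rather than explicit Python lists.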

Acknowledgement

This work was supported by the Autonomous Creative Academic Research Fund of Pukyong National University (2021).
