Facial Image Synthesis by Controlling Skin Microelements

  • Kim, Yujin (Inha University, Department of Electrical & Computer Engineering) ;
  • Park, In Kyu (Inha University, Department of Electrical & Computer Engineering)
  • Received : 2022.03.30
  • Accepted : 2022.04.07
  • Published : 2022.05.30

Abstract

Recent deep learning-based face synthesis methods generate highly realistic faces, including overall style and elements such as hair, glasses, and makeup. However, these methods cannot generate a face at a very fine level of detail, such as the microstructure of the skin. In this paper, to overcome this limitation, we propose a technique for synthesizing a more realistic facial image from a single face label image by controlling the types and intensities of skin microelements. The proposed technique uses Pix2PixHD, an image-to-image translation method, to convert a label image that marks the facial region and skin elements such as wrinkles, pores, and redness into a facial image with the microelements added. Experimental results show that by generating various label images with adjusted skin element regions, it is possible to create a variety of realistic face images reflecting the corresponding fine skin elements.
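The abstract describes feeding a multi-channel label image, marking the facial region plus wrinkle, pore, and redness regions at controllable intensities, to a Pix2PixHD generator. The sketch below illustrates one plausible way such a label image could be constructed before translation; the channel layout, region shapes, and function names are assumptions for illustration, not the paper's actual data format.

```python
import numpy as np

# Hypothetical channel layout (an assumption, not the paper's format):
# channel 0 = face region mask, channels 1-3 = wrinkles, pores, redness.
CHANNELS = {"face": 0, "wrinkles": 1, "pores": 2, "redness": 3}


def make_label_image(size=256):
    """Create an empty multi-channel label image of shape (H, W, C) in [0, 1]."""
    return np.zeros((size, size, len(CHANNELS)), dtype=np.float32)


def paint_element(label, element, y0, y1, x0, x1, intensity=1.0):
    """Mark a rectangular region of one skin element at a given intensity.

    Varying `intensity` and the region extent yields different label images;
    each one would then be passed through the Pix2PixHD generator to
    synthesize a face with correspondingly stronger or weaker microelements.
    """
    c = CHANNELS[element]
    label[y0:y1, x0:x1, c] = np.clip(intensity, 0.0, 1.0)
    return label


# Build one example label image with adjusted skin element regions.
label = make_label_image(256)
label = paint_element(label, "face", 32, 224, 48, 208, 1.0)
label = paint_element(label, "wrinkles", 60, 90, 70, 180, 0.5)   # forehead, half strength
label = paint_element(label, "redness", 140, 170, 60, 110, 0.8)  # cheek area
```

In a real pipeline the resulting array would be converted to a tensor and fed to the trained generator; sweeping the intensity values produces the family of label images the experiments vary.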

Keywords

Acknowledgement

This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) in 2022 (2020-0-01389, Artificial Intelligence Convergence Research Center (Inha University)) and by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2019R1A2C1006706).

References

  1. Mark-Vu, https://www.psiplus.co.kr/page/?M2_IDX=8160 (accessed Mar. 15, 2022).
  2. I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," Proc. Advances in Neural Information Processing Systems, December 2014.
  3. M. Mirza and S. Osindero, "Conditional generative adversarial nets," arXiv preprint arXiv:1411.1784, 2014.
  4. Y. Choi, M. Choi, M. Kim, J. -W. Ha, S. Kim, and J. Choo, "StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation," Proc. IEEE Conference on Computer Vision and Pattern Recognition, June 2018. doi: https://doi.org/10.1109/cvpr.2018.00916
  5. Y. Shen, J. Gu, X. Tang, and B. Zhou, "Interpreting the latent space of GANs for semantic face editing," Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2020. doi: https://doi.org/10.1109/cvpr42600.2020.00926
  6. E. Richardson, Y. Alaluf, O. Patashnik, Y. Nitzan, Y. Azar, S. Shapiro, and D. Cohen-Or, "Encoding in style: A StyleGAN encoder for image-to-image translation," Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2021. doi: https://doi.org/10.1109/cvpr46437.2021.00232
  7. Z. Huang, S. Chen, J. Zhang, and H. Shan, "PFA-GAN: Progressive face aging with generative adversarial network," IEEE Trans. on Information Forensics and Security, Vol. 16, pp. 2031-2045, December 2020. doi: https://doi.org/10.1109/tifs.2020.3047753
  8. T. -C. Wang, M. -Y. Liu, J. -Y. Zhu, A. Tao, J. Kautz, and B. Catanzaro, "High-resolution image synthesis and semantic manipulation with conditional GANs," Proc. IEEE Conference on Computer Vision and Pattern Recognition, June 2018. doi: https://doi.org/10.1109/cvpr.2018.00917
  9. P. Isola, J. -Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," Proc. IEEE Conference on Computer Vision and Pattern Recognition, June 2017. doi: https://doi.org/10.1109/cvpr.2017.632
  10. U. Demir and G. Unal, "Patch-based image inpainting with generative adversarial networks," arXiv preprint arXiv:1803.07422, 2018.
  11. J. -Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," Proc. IEEE International Conference on Computer Vision, October 2017. doi: https://doi.org/10.1109/iccv.2017.244
  12. T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," Proc. International Conference on Learning Representations, April 2018.
  13. T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen, and T. Aila, "Analyzing and improving the image quality of StyleGAN," Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2020. doi: https://doi.org/10.1109/cvpr42600.2020.00813
  14. T. Karras, M. Aittala, J. Hellsten, S. Laine, J. Lehtinen, and T. Aila, "Training generative adversarial networks with limited data," Proc. Advances in Neural Information Processing Systems, December 2020.
  15. G. Perarnau, J. van de Weijer, B. Raducanu, and J. M. Alvarez, "Invertible conditional GANs for image editing," arXiv preprint arXiv:1611.06355, 2016.
  16. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, Vol. 86, No. 11, pp. 2278-2324, November 1998. doi: https://doi.org/10.1109/5.726791
  17. Z. Liu, P. Luo, X. Wang, and X. Tang, "Deep learning face attributes in the wild," Proc. IEEE International Conference on Computer Vision, December 2015. doi: https://doi.org/10.1109/iccv.2015.425
  18. Z. He, W. Zuo, M. Kan, S. Shan, and X. Chen, "AttGAN: Facial attribute editing by only changing what you want," IEEE Trans. on Image Processing, Vol. 28, No. 11, pp. 5464-5478, May 2019. doi: https://doi.org/10.1109/tip.2019.2916751
  19. C. -H. Lee, Z. Liu, L. Wu, and P. Luo, "MaskGAN: Towards diverse and interactive facial image manipulation," Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2020. doi: https://doi.org/10.1109/cvpr42600.2020.00559
  20. C. Yu, J. Wang, C. Peng, C. Gao, G. Yu, and N. Sang, "BiSeNet: Bilateral segmentation network for real-time semantic segmentation," Proc. European Conference on Computer Vision, September 2018. doi: https://doi.org/10.1007/978-3-030-01261-8_20