http://dx.doi.org/10.3837/tiis.2021.10.012

Semi-Supervised Spatial Attention Method for Facial Attribute Editing  

Yang, Hyeon Seok (Department of Computer Science and Engineering, Hanyang University)
Han, Jeong Hoon (Department of Computer Science and Engineering, Hanyang University)
Moon, Young Shik (Department of Computer Science and Engineering, Hanyang University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS), vol. 15, no. 10, pp. 3685-3707, 2021
Abstract
In recent years, facial attribute editing based on generative adversarial networks and encoder-decoder models has been used successfully to change various attributes of face images. However, existing models may alter unintended regions while changing an attribute or may generate unnatural results. In this paper, we propose a model that improves attention-mask learning by adding a spatial attention mechanism to the unified selective transfer network (STGAN) and training it with semi-supervised learning. The proposed model can edit multiple attributes while preserving details that are independent of the attributes being edited. This study makes two main contributions. First, we propose an encoder-decoder model structure that learns and edits multiple facial attributes while suppressing distortion with an attention mask. Second, we define guide masks and propose a method and an objective function that use the guide masks for multiple facial attribute editing through semi-supervised learning. Qualitative and quantitative evaluations show that the proposed method suppresses unintended changes and preserves image details better than existing methods.
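To make the mechanism summarized above concrete, the sketch below illustrates two ideas: (a) blending the edited output with the input image through a spatial attention mask, so regions outside the mask keep the original details, and (b) a semi-supervised mask-supervision term that applies only to samples for which a guide mask exists. The function names, array shapes, and the simple L1 penalty are illustrative assumptions for exposition, not the authors' implementation or objective function.

import numpy as np

def blend_with_attention(x, edited, mask):
    # x, edited: (H, W, C) images in [0, 1]; mask: (H, W, 1) in [0, 1], where 1 marks the region to edit.
    # Only the attended region is replaced, so attribute-independent details of the input are preserved.
    return mask * edited + (1.0 - mask) * x

def guide_mask_loss(mask, guide_mask=None):
    # Semi-supervised supervision of the attention mask (illustrative):
    # penalize the L1 distance to the guide mask when one is available,
    # and contribute nothing for unlabeled samples.
    if guide_mask is None:
        return 0.0
    return float(np.mean(np.abs(mask - guide_mask)))

In a full training objective, a term of this kind would typically be weighted and combined with the adversarial, attribute-classification, and reconstruction losses used by STGAN-style models.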
Keywords
Facial attribute editing; spatial attention mechanism; semi-supervised learning; generative adversarial network; STGAN
References
1 I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," Advances in Neural Information Processing Systems, pp. 1-9, 2014.
2 W. Shen and R. Liu, "Learning residual images for face attribute manipulation," in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1225-1233, Jul. 2017.
3 Z. He, W. Zuo, M. Kan, S. Shan, and X. Chen, "AttGAN: facial attribute editing by only changing what you want," IEEE Transactions on Image Processing, vol. 28, no. 11, pp. 5464-5478, Nov. 2019.
4 P. Chen, Q. Xiao, J. Xu, X. Dong, and L. Sun, "Facial attribute editing using semantic segmentation," in Proc. of 2019 Int. Conf. on High Performance Big Data and Intelligent Systems (HPBD&IS), pp. 97-103, May 2019.
5 I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville, "Improved training of Wasserstein GANs," Advances in Neural Information Processing Systems, pp. 5767-5777, Dec. 2017.
6 P. Isola, J. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pp. 5967-5976, Jul. 2017.
7 M. Liu, Y. Ding, M. Xia, X. Liu, E. Ding, W. Zuo, and S. Wen, "STGAN: a unified selective transfer network for arbitrary image attribute editing," in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pp. 3668-3677, Jun. 2019.
8 K. Zhang, Y. Su, X. Guo, L. Qi, and Z. Zhao, "MU-GAN: Facial Attribute Editing Based on Multi-Attention Mechanism," IEEE/CAA Journal of Automatica Sinica, vol. 8, no. 9, pp. 1614-1626, Sep. 2021.
9 B. Diallo, J. Hu, T. Li, G. A. Khan, and A. S. Hussein, "Multi-view document clustering based on geometrical similarity measurement," International Journal of Machine Learning and Cybernetics, pp. 1-13, Mar. 2021.
10 S. Minaee, Y. Y. Boykov, F. Porikli, A. J. Plaza, N. Kehtarnavaz, and D. Terzopoulos, "Image segmentation using deep learning: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.
11 K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pp. 770-778, Jun. 2016.
12 X. Chen, C. Xu, X. Yang, and D. Tao, "Attention-GAN for object transfiguration in wild images," in Proc. of the European Conf. on Computer Vision (ECCV), pp. 167-184, Oct. 2018.
13 J. Bergstra and Y. Bengio, "Random search for hyper-parameter optimization," Journal of Machine Learning Research, vol. 13, pp. 281-305, Feb. 2012.
14 G. A. Khan, J. Hu, T. Li, B. Diallo, and Y. Zhao, "Multi-view low rank sparse representation method for three-way clustering," International Journal of Machine Learning and Cybernetics, pp. 1-21, Aug. 2021.
15 G. Zhang, M. Kan, S. Shan, and X. Chen, "Generative adversarial network with spatial attention for facial attribute editing," in Proc. of the European Conf. on Computer Vision (ECCV), pp. 422-437, Oct. 2018.
16 X. Zheng, Y. Guo, H. Huang, Y. Li, and R. He, "A survey to deep facial attribute analysis," International Journal of Computer Vision, vol. 128, pp. 2002-2034, Mar. 2020.
17 G. A. Khan, J. Hu, T. Li, B. Diallo, and H. Wang, "Multi-view data clustering via non-negative matrix factorization with manifold regularization," International Journal of Machine Learning and Cybernetics, pp. 1-13, Mar. 2021.
18 Z. Wei, H. Bai, and Y. Zhao, "Stage-GAN with semantic maps for large-scale image super-resolution," KSII Transactions on Internet and Information Systems, vol. 13, no. 8, pp. 3942-3961, Aug. 2019.
19 S. Hong, S. Kim, and S. Kang, "Game sprite generator using a multi discriminator GAN," KSII Transactions on Internet and Information Systems, vol. 13, no. 8, pp. 4255-4269, Aug. 2019.
20 M. Mirza and S. Osindero, "Conditional generative adversarial nets," arXiv preprint arXiv:1411.1784, pp. 1-7, Nov. 2014.
21 G. Perarnau, J. V. D. Weijer, B. Raducanu, and J. M. Alvarez, "Invertible conditional GANs for image editing," in Proc. of NIPS 2016 Workshop on Adversarial Training, pp. 1-9, Nov. 2016.
22 Y. Choi, M. Choi, M. Kim, J. Ha, S. Kim, and J. Choo, "StarGAN: unified generative adversarial networks for multi-domain image-to-image translation," in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pp. 8789-8797, Jun. 2018.
23 B. Diallo, J. Hu, T. Li, and G. A. Khan, "Deep Embedding Clustering Based on Contractive Autoencoder," Neurocomputing, vol. 433, pp. 96-107, Jan. 2021.
24 C. Lee, Z. Liu, L. Wu, and P. Luo, "MaskGAN: towards diverse and interactive facial image manipulation," in Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition, pp. 5548-5557, Jun. 2020.
25 M. Arjovsky, S. Chintala, and L. Bottou, "Wasserstein generative adversarial networks," in Proc. of the 34th Int. Conf. on Machine Learning, vol. 70, pp. 214-223, 2017.
26 Z. Liu, P. Luo, X. Wang, and X. Tang, "Deep learning face attributes in the wild," in Proc. of the IEEE Int. Conf. on Computer Vision, pp. 3730-3738, Dec. 2015.
27 Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004.
28 C. A. Floudas and P. M. Pardalos, Encyclopedia of optimization, Boston, MA, USA: Springer, 2009.
29 Z. He, W. Zuo, M. Kan, S. Shan, and X. Chen, TensorFlow implementation of AttGAN: Facial Attribute Editing by Only Changing What You Want, 2019, [Online]. Available: https://github.com/LynnHo/AttGAN-Tensorflow/tree/v1
30 C. Hu, X. Wu, and Z. Shu, "Bagging deep convolutional autoencoders trained with a mixture of real data and GAN-generated data," KSII Transactions on Internet and Information Systems, vol. 13, no. 11, pp. 5427-5445, Nov. 2019.
31 M. Liu, Y. Ding, M. Xia, X. Liu, E. Ding, W. Zuo, and S. Wen, TensorFlow implementation of STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing, 2019, [Online]. Available: https://github.com/csmliu/STGAN
32 J. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in Proc. of the IEEE Int. Conf. on Computer Vision, pp. 2242-2251, Oct. 2017.