http://dx.doi.org/10.3837/tiis.2021.06.010

A Novel Cross Channel Self-Attention based Approach for Facial Attribute Editing  

Xu, Meng (School of Computer Science & Technology, Tiangong University)
Jin, Rize (School of Computer Science & Technology, Tiangong University)
Lu, Liangfu (Medical College, Tianjin University)
Chung, Tae-Sun (Department of Artificial Intelligence, Ajou University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.15, no.6, 2021, pp. 2115-2127
Abstract
Although significant progress has been made in synthesizing visually realistic face images with Generative Adversarial Networks (GANs), effective approaches that provide fine-grained control over the generation process for semantic facial attribute editing are still lacking. In this work, we propose a novel cross channel self-attention based generative adversarial network (CCA-GAN), which weights the importance of multiple channels of features and achieves pixel-level feature alignment and conversion, reducing the impact on irrelevant attributes while editing the target attributes. Evaluation results show that CCA-GAN outperforms state-of-the-art models on the CelebA dataset, reducing Fréchet Inception Distance (FID) and Kernel Inception Distance (KID) by 15~28% and 25~100%, respectively. Furthermore, visualization of generated samples confirms the disentanglement effect of the proposed model.
Keywords
Generative Adversarial Network; Cross Channel Self-Attention; Image Translation; Style Transfer; Facial Attribute Editing;
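The core idea named in the abstract, weighting the importance of feature channels via self-attention, can be illustrated with a minimal NumPy sketch. This is a generic channel-wise self-attention block (a C x C attention map re-weighting channels, followed by a residual connection), not the exact CCA-GAN module; the function name, `gamma` scaling parameter, and scaled dot-product scoring are illustrative assumptions.

```python
import numpy as np

def channel_self_attention(x, gamma=1.0):
    """Illustrative channel-wise self-attention over a feature map.

    x: feature map of shape (C, H, W). Each channel attends to every
    other channel; the (C, C) attention map weights channel importance.
    NOTE: a hypothetical sketch, not the published CCA-GAN architecture.
    """
    C, H, W = x.shape
    flat = x.reshape(C, H * W)                    # (C, N) flattened channels
    scores = flat @ flat.T / np.sqrt(H * W)       # (C, C) channel affinities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # row-wise softmax
    out = attn @ flat                             # re-weight channels
    return (gamma * out + flat).reshape(C, H, W)  # residual connection
```

With `gamma=0` the block reduces to the identity mapping, which is the usual initialization for attention layers in GAN generators so that training starts from the plain convolutional backbone.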