Counterfactual image generation by disentangling data attributes with deep generative models

  • Jieon Lim (Department of Statistics, EWHA Womans University)
  • Weonyoung Joo (Department of Statistics, EWHA Womans University)
  • Received : 2023.02.19
  • Accepted : 2023.09.23
  • Published : 2023.11.30

Abstract

Deep generative models aim to infer the underlying true data distribution, which has led to great success in generating fake-but-realistic data. From this perspective, data attributes can be a crucial factor in the data generation process, since non-existent counterfactual samples can be generated by altering certain factors. For example, we can generate new portrait images by flipping the gender attribute or altering the hair color attribute. This paper proposes the counterfactual disentangled variational autoencoder generative adversarial network (CDVAE-GAN), which is specialized for attribute-level counterfactual data generation. The proposed CDVAE-GAN combines a variational autoencoder with a generative adversarial network. Specifically, we adopt a Gaussian variational autoencoder to extract low-dimensional disentangled data features, and auxiliary Bernoulli latent variables to model the data attributes separately. We also use a generative adversarial network to generate data with high fidelity. By combining the benefits of the variational autoencoder with additional Bernoulli latent variables and the generative adversarial network, the proposed CDVAE-GAN can control the data attributes, which enables it to produce counterfactual data. Our experimental results on the CelebA dataset show qualitatively that the samples generated by CDVAE-GAN are realistic. The quantitative results further indicate that the proposed model can produce data whose altered attributes deceive other machine learning classifiers.
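The attribute-flipping idea described in the abstract can be illustrated with a minimal sketch. It is not the CDVAE-GAN implementation: the function names, latent dimensions, and probabilities below are hypothetical, and the encoder, decoder, and discriminator networks are omitted entirely. The sketch only shows a latent code split into continuous Gaussian features and binary Bernoulli attributes, where a counterfactual code is formed by flipping one attribute while the continuous features stay fixed.

```python
import math
import random


def sample_gaussian_latent(mu, log_var, rng):
    """Reparameterized draw z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def sample_bernoulli_attributes(probs, rng):
    """Draw one binary attribute (0 or 1) per success probability."""
    return [1 if rng.random() < p else 0 for p in probs]


def flip_attribute(attributes, index):
    """Return a copy of the attribute vector with one binary entry flipped."""
    flipped = list(attributes)
    flipped[index] = 1 - flipped[index]
    return flipped


def decoder_input(z, attributes):
    """Concatenate continuous features and binary attributes into one code."""
    return list(z) + [float(a) for a in attributes]


# Hypothetical encoder outputs: 2 continuous features, 3 binary attributes.
rng = random.Random(0)
z = sample_gaussian_latent(mu=[0.0, 0.5], log_var=[0.0, -1.0], rng=rng)
attrs = sample_bernoulli_attributes(probs=[0.9, 0.2, 0.5], rng=rng)

# Counterfactual code: same continuous features, first attribute flipped.
counterfactual_attrs = flip_attribute(attrs, index=0)
factual = decoder_input(z, attrs)
counterfactual = decoder_input(z, counterfactual_attrs)
```

In a full model, both `factual` and `counterfactual` would be passed to a trained decoder or generator; here they are plain lists so that the separation between the two latent parts is easy to inspect.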

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2022-00166289).
