Injection of Cultural-based Subjects into Stable Diffusion Image Generative Model

  • Amirah Alharbi (Department of Computer Science and Artificial Intelligence, College of Computing, Umm Alqura University) ;
  • Reem Alluhibi (Department of Computer Science and Artificial Intelligence, College of Computing, Umm Alqura University) ;
  • Maryam Saif (Department of Computer Science and Artificial Intelligence, College of Computing, Umm Alqura University) ;
  • Nada Altalhi (Department of Computer Science and Artificial Intelligence, College of Computing, Umm Alqura University) ;
  • Yara Alharthi (Department of Computer Science and Artificial Intelligence, College of Computing, Umm Alqura University)
  • Received : 2024.02.05
  • Published : 2024.02.29

Abstract

While text-to-image models have made remarkable progress in image synthesis, certain models, particularly generative diffusion models, have exhibited a noticeable bias towards generating images related to the culture of some developing countries. This paper presents an empirical investigation aimed at mitigating the bias of an image generative model. We achieve this by incorporating symbols representing Saudi culture into a Stable Diffusion model using the Dreambooth technique. The CLIP score metric is used to assess the outcomes of this study. The paper also explores the impact of varying training parameters, such as the number of training images and the learning rate. The findings reveal a substantial reduction in bias-related concerns and propose an innovative metric for evaluating cultural relevance.
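The paper evaluates the fine-tuned model with the CLIP score but does not reproduce its evaluation code here. The snippet below is a minimal sketch of how a CLIP score between a generated image and its prompt can be computed; the use of the Hugging Face transformers library, the openai/clip-vit-base-patch32 checkpoint, and the conventional scaling by 100 are assumptions rather than details taken from the paper.

```python
# Minimal sketch (assumption: Hugging Face transformers CLIP; not the authors' code)
# of computing a CLIP score between a generated image and its text prompt.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_score(image: Image.Image, prompt: str) -> float:
    """Cosine similarity between CLIP image and text embeddings, scaled by 100."""
    inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # image_embeds and text_embeds are the projected CLIP embeddings; re-normalise before the dot product.
    image_emb = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)
    text_emb = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
    return float(100.0 * (image_emb * text_emb).sum())

# Example usage: score an image produced by the Dreambooth-tuned Stable Diffusion model
# against the cultural prompt it was generated from (the prompt below is a hypothetical example).
# score = clip_score(Image.open("generated.png"), "a photo of a traditional Saudi dallah coffee pot")
```

A higher score indicates closer semantic alignment between the generated image and its prompt.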

Keywords

Acknowledgements

Dr. Alharbi would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work through Grant Code (23UQU43101400DSR005). She would also like to express her gratitude for the support of this research (ID: 4401095348).
