A Study of MultiModal Foundation Model in Fashion Recommendation

  • Dere Roshidat Oluwabukola (Dept. of Artificial Intelligence Convergence, Chonnam National University) ;
  • Kyungbeak Kim (Dept. of Artificial Intelligence Convergence, Chonnam National University)
  • Published : 2024.10.31

Abstract

Influenced by societal trends, cultural standards, and individual personalities, fashion is a potent means of self-expression. Many industries have benefited from the advancement of Artificial Intelligence (AI), with the fashion industry emerging as one of the most notable. AI has assisted the fashion industry in a number of areas, including product design and marketing. Online shopping has proliferated as the fashion business has expanded into a multibillion-dollar industry, offering customers easy, stress-free shopping experiences. Advising customers on what to buy could increase sales of the recommended items and of related products. The goal of this study is to qualitatively investigate multimodal foundation models for fashion critique and advice. In this paper, we adapted Gemini 1.5 Flash to our dataset for compatibility prediction and complementary commentary on clothing. Qualitatively, the model provided in-depth reviews across varying images while also critiquing fashion combinations that are not compatible. The study points to the robustness of multimodal models and recommends quantitative evaluation in future studies.

Acknowledgement

This work was supported by the Innovative Human Resource Development for Local Intellectualization program through the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (IITP-2024-RS-2022-00156287, 50%). This work was also supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Artificial Intelligence Convergence Innovation Human Resources Development grant (IITP-2023-RS-2023-00256629, 50%) funded by the Korea government (MSIT).
