Learning T.P.O Inference Model of Fashion Outfit Using LDAM Loss in Class Imbalance


  • Park, Jonghyuk (Division of Industrial Engineering, Seoul National University)
  • Received : 2021.02.01
  • Accepted : 2021.03.20
  • Published : 2021.03.28

Abstract

When dressing, it is important to compose an outfit appropriate to the intended occasion. Accordingly, the T.P.O (Time, Place, Occasion) of an outfit is considered in various artificial-intelligence-based fashion recommendation systems. However, few studies directly infer T.P.O from outfit images, because the nature of the problem gives rise to multi-label and class-imbalance issues that make model training challenging. In this study, we therefore propose a model that infers the T.P.O of outfit images by employing a label-distribution-aware margin (LDAM) loss function. Datasets for model training and evaluation were collected from fashion shopping malls. Performance measurements confirmed that, compared to the baselines, the proposed model achieves balanced performance across all T.P.O classes.
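The LDAM loss referenced in the abstract assigns each class a margin inversely proportional to the fourth root of its training-set frequency, so rare classes are pushed further from the decision boundary. Below is a minimal NumPy sketch of the single-label form; the function names are illustrative, and the `max_margin=0.5` rescaling and `scale=30.0` logit scaling are common defaults from Cao et al.'s reference implementation, not necessarily this paper's exact configuration.

```python
import numpy as np

def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins proportional to n_j^(-1/4), rescaled so the
    largest (rarest-class) margin equals max_margin."""
    m = np.asarray(class_counts, dtype=float) ** -0.25
    return m * (max_margin / m.max())

def ldam_loss(logits, labels, margins, scale=30.0):
    """Cross-entropy computed after subtracting the class-dependent
    margin from the true-class logit and rescaling logits by `scale`."""
    z = np.array(logits, dtype=float)          # copy so input is untouched
    rows = np.arange(len(labels))
    z[rows, labels] -= margins[labels]         # enforce the margin
    z *= scale
    z -= z.max(axis=1, keepdims=True)          # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[rows, labels].mean()
```

Because the margin only lowers the true-class logit, this loss upper-bounds plain cross-entropy on the same logits, and with zero margins it reduces to scaled cross-entropy. For the multi-label setting described in the paper, the same per-class margin shift would be applied inside a per-label binary (sigmoid) loss rather than a softmax.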



References

  1. K. J. Tschu. (2007. Feb). Kleidungssignale als nonverbale Kommunikationsmittel. Cogito, 61, 243-260.
  2. M. J. Lee & I. S. Lee. (2012. Jan). Party Wear Industry Conditions in Korea and the Analysis of Dress Style According to Party Types. Journal of the Korean Society of Clothing and Textiles, 36(1), 12-26. https://doi.org/10.5850/JKSCT.2012.36.1.12
  3. Y. LeCun et al. (1990. Nov). Handwritten Digit Recognition with a Back-propagation Network. In Proceedings of the Advances in neural information processing systems (pp. 396-404).
  4. H. J. Kim, S. H. Lee, H. H. Han & J. S. Kim. (2020. Dec). Saliency Attention Method for Salient Object Detection Based on Deep Learning. Journal of The Korea Convergence Society, 11(12), 39-47. https://doi.org/10.15207/JKCS.2020.11.12.039
  5. D. W. Lee, S. H. Lee & H. H. Han. (2020. Dec). Deep Learning-based Super Resolution Method Using Combination of Channel Attention and Spatial Attention. Journal of The Korea Convergence Society, 11(12), 15-22. https://doi.org/10.15207/JKCS.2020.11.12.015
  6. S. H. Sung, K. B. Lee & S. H. Park. (2020. Jun). Research on Korea Text Recognition in Images Using Deep Learning. Journal of The Korea Convergence Society, 11(6), 1-6. https://doi.org/10.15207/JKCS.2020.11.6.001
  7. D. H. Cho, Y. W. Nam, H. C. Lee & Y. H. Kim. (2019. Sep). Image Mood Classification Using Deep CNN and Its Application to Automatic Video Generation. Journal of The Korea Convergence Society, 10(9), 23-29.
  8. K. Cao, C. Wei, A. Gaidon, N. Arechiga & T. Ma. (2019. Dec). Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss. In Proceedings of Advances in Neural Information Processing Systems (pp. 1567-1578).
  9. K. He, X. Zhang, S. Ren & J. Sun. (2016. Jun). Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).
  10. Y. Cui, M. Jia, T. Y. Lin, Y. Song & S. Belongie. (2019. Jun). Class-balanced loss based on effective number of samples. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 9268-9277).
  11. Z. Liu, P. Luo, S. Qiu, X. Wang & X. Tang. (2016. Jun). DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1096-1104).
  12. W. Wang, Y. Xu, J. Shen & S. C. Zhu. (2018. Jun). Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4271-4280).
  13. Y. Miao, G. Li, C. Bao, J. Zhang & J. Wang. (2020). ClothingNet: Cross-domain Clothing Retrieval with Feature Fusion and Quadruplet Loss. IEEE Access, 8, 142669-142679. https://doi.org/10.1109/access.2020.3013631
  14. M. Jia et al. (2020. Aug). Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset. In Proceedings of the European Conference on Computer Vision (pp. 316-332).
  15. R. Miyamoto, T. Nakajima & T. Oki. (2019. May). Accurate Fashion Style Estimation with a Novel Training Set and Removal of Unnecessary Pixels. In Proceedings of the IEEE International Symposium on Circuits and Systems (pp. 1-5).
  16. D. Verma, K. Gulati, V. Goel & R. R. Shah. (2020. Oct). Fashionist: Personalising Outfit Recommendation for Cold-Start Scenarios. In Proceedings of the ACM International Conference on Multimedia (pp. 4527-4529).
  17. Y. Ma, X. Yang, L. Liao, Y. Cao & T. S. Chua. (2019. Oct). Who, Where, and What to Wear? Extracting Fashion Knowledge from Social Media. In Proceedings of the ACM International Conference on Multimedia (pp. 257-265).
  18. A. Graves & J. Schmidhuber. (2005, Jul-Aug). Framewise Phoneme Classification with Bidirectional LSTM and Other Neural Network Architectures. Neural Networks, 18(5-6), 602-610. https://doi.org/10.1016/j.neunet.2005.06.042
  19. M. Takagi, E. Simo-Serra, S. Iizuka & H. Ishikawa. (2017. Oct). What Makes a Style: Experimental Analysis of Fashion Prediction. In Proceedings of the IEEE International Conference on Computer Vision Workshops (pp. 2247-2253).
  20. T. Y. Lin, P. Goyal, R. Girshick, K. He & P. Dollar. (2017. Oct). Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2980-2988).
  21. B. Li, Y. Liu & X. Wang. (2019. Jan). Gradient Harmonized Single-stage Detector. In Proceedings of the AAAI Conference on Artificial Intelligence (pp. 8577-8584).
  22. V. Nair & G. E. Hinton. (2010. Jun). Rectified Linear Units Improve Restricted Boltzmann Machines. In Proceedings of the International Conference on Machine Learning (pp. 807-814).
  23. C. Ferri, J. Hernandez-Orallo & R. Modroiu. (2009). An experimental comparison of performance measures for classification. Pattern Recognition Letters, 30(1), 27-38. https://doi.org/10.1016/j.patrec.2008.08.010
  24. R. Alejo, J. A. Antonio, R. M. Valdovinos & J. H. Pacheco-Sanchez. (2013. Jun). Assessments metrics for multi-class imbalance learning: A preliminary study. In Proceedings of the Mexican Conference on Pattern Recognition (pp. 335-343).
  25. J. Wang et al. (2020. Apr). Deep High-Resolution Representation Learning for Visual Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1-1.
  26. J. Y. Zhu, T. Park, P. Isola & A. A. Efros. (2017. Oct). Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2223-2232).