Food Detection by Fine-Tuning Pre-trained Convolutional Neural Network Using Noisy Labels

  • Received : 2021.07.05
  • Published : 2021.07.30

Abstract

Deep learning is an advanced technology for large-scale data analysis, with many promising applications such as image processing and object detection. It has become customary to use transfer learning and fine-tune a pre-trained CNN model for most image recognition tasks. Photos that people take and tag themselves are a valuable source of data. However, these tags and labels may be noisy, because the people who annotate the images are often not experts. This paper explores the impact of noisy labels on fine-tuning pre-trained CNN models. The effect is measured on a food recognition task using Food101 as a benchmark. Four pre-trained CNN models are included in this study: InceptionV3, VGG19, MobileNetV2, and DenseNet121. Symmetric label noise is added at different ratios. In all cases, models based on DenseNet121 outperformed the other models. When noisy labels were introduced into the data, the performance of all models degraded almost linearly with the amount of added noise.
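The symmetric label noise mentioned in the abstract can be illustrated with a short sketch. This is not the authors' code: the function name `add_symmetric_noise`, its parameters, and the convention used here (an exact fraction of labels is selected and each is replaced by a different class chosen uniformly at random) are assumptions made for illustration. The abstract does not state which variant of symmetric noise was used or which noise ratios were tested.

```python
import random


def add_symmetric_noise(labels, num_classes, noise_ratio, seed=0):
    """Return a copy of `labels` with a fraction `noise_ratio` corrupted.

    Assumed convention (not taken from the paper): each selected label is
    replaced by a *different* class, drawn uniformly from the remaining
    `num_classes - 1` classes.
    """
    if not 0.0 <= noise_ratio <= 1.0:
        raise ValueError("noise_ratio must be in [0, 1]")
    if num_classes < 2:
        raise ValueError("need at least two classes to flip a label")

    rng = random.Random(seed)  # local generator, so results are repeatable
    noisy = list(labels)
    n_flip = round(noise_ratio * len(noisy))

    # Pick n_flip distinct positions and corrupt each one.
    for i in rng.sample(range(len(noisy)), n_flip):
        # Draw from num_classes - 1 values, then shift past the true label
        # so the new label is uniform over all *other* classes.
        new_label = rng.randrange(num_classes - 1)
        if new_label >= noisy[i]:
            new_label += 1
        noisy[i] = new_label
    return noisy


# Hypothetical usage: integer labels for a 101-class problem such as Food101.
clean = [i % 101 for i in range(1010)]
noisy = add_symmetric_noise(clean, num_classes=101, noise_ratio=0.2)
changed = sum(c != n for c, n in zip(clean, noisy))
print(f"{changed} of {len(clean)} labels were changed")
```

Because every corrupted label is forced to differ from the original, the realised noise rate equals the requested ratio up to rounding. A variant that resamples uniformly over all classes, including the true one, would give a slightly lower effective rate.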
