http://dx.doi.org/10.3837/tiis.2020.04.016

A Hierarchical deep model for food classification from photographs  

Yang, Heekyung (Dept. of Computer Science, Graduate School, Sangmyung Univ.)
Kang, Sungyong (Dept. of Computer Science, Sangmyung Univ.)
Park, Chanung (Dept. of Computer Science, Sangmyung Univ.)
Lee, JeongWook (Dept. of Computer Science, Sangmyung Univ.)
Yu, Kyungmin (Dept. of Computer Science, Sangmyung Univ.)
Min, Kyungha (Dept. of Computer Science, Sangmyung Univ.)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.14, no.4, 2020, pp. 1704-1720
Abstract
Recognizing food from photographs has many applications in machine learning, computer vision, and dietetics. Recent progress in deep learning has greatly accelerated food recognition. We build a hierarchical structure composed of deep CNNs to recognize and classify food from photographs. We build a dataset of Korean food with 18 classes, which are grouped into four major classes. In the first step, our hierarchical recognizer classifies a food into one of the four major classes. In the second step, the food is further classified into its exact class within that major class. We employ the DenseNet structure as the baseline of our recognizer. The hierarchical structure achieves higher accuracy and F1 score than a single-structured recognizer.
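The two-step classification described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the scorers are hypothetical stand-in functions where the paper would use trained DenseNet-based CNNs, the inputs are toy feature lists rather than photographs, and the labels (`major_a`, `a1`, and so on) are placeholders rather than the dataset's actual classes.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A scorer maps one input to a list of per-class scores. In the paper each
# scorer would be a trained CNN; here they are hypothetical stand-ins.
Scorer = Callable[[Sequence[float]], Sequence[float]]


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


class HierarchicalClassifier:
    """Two-step classifier: pick a major class, then a class inside it."""

    def __init__(
        self,
        major_labels: List[str],
        major_scorer: Scorer,
        fine_labels: Dict[str, List[str]],
        fine_scorers: Dict[str, Scorer],
    ) -> None:
        self.major_labels = major_labels
        self.major_scorer = major_scorer
        self.fine_labels = fine_labels
        self.fine_scorers = fine_scorers

    def predict(self, x: Sequence[float]) -> Tuple[str, str]:
        # Step 1: choose the major class.
        major = self.major_labels[argmax(self.major_scorer(x))]
        # Step 2: only the scorer dedicated to that major class is run.
        fine_scores = self.fine_scorers[major](x)
        fine = self.fine_labels[major][argmax(fine_scores)]
        return major, fine


# Toy setup with placeholder labels (assumption: not the real dataset classes).
MAJOR = ["major_a", "major_b"]
FINE = {"major_a": ["a1", "a2"], "major_b": ["b1", "b2", "b3"]}

clf = HierarchicalClassifier(
    MAJOR,
    major_scorer=lambda x: [x[0], 1.0 - x[0]],
    fine_labels=FINE,
    fine_scorers={
        "major_a": lambda x: [x[1], 1.0 - x[1]],
        "major_b": lambda x: [x[1], x[2], 1.0 - x[1] - x[2]],
    },
)

print(clf.predict([0.9, 0.2, 0.0]))  # ('major_a', 'a2')
print(clf.predict([0.1, 0.2, 0.7]))  # ('major_b', 'b2')
```

In this kind of design each second-step scorer only has to separate the classes inside one major class. The trade-off is that a first-step error cannot be corrected in the second step.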
Keywords
CNN; DenseNet; classification; food; dietetics