http://dx.doi.org/10.5573/ieie.2017.54.5.35

Performance Analysis of Hint-KD Training Approach for the Teacher-Student Framework Using Deep Residual Networks  

Bae, Ji-Hoon (KSB Convergence Research Department, Electronics and Telecommunications Research Institute)
Yim, Junho (Department of Electrical Engineering, Korea Advanced Institute of Science and Technology)
Yu, Jaehak (KSB Convergence Research Department, Electronics and Telecommunications Research Institute)
Kim, Kwihoon (KSB Convergence Research Department, Electronics and Telecommunications Research Institute)
Kim, Junmo (Department of Electrical Engineering, Korea Advanced Institute of Science and Technology)
Publication Information
Journal of the Institute of Electronics and Information Engineers / v.54, no.5, 2017, pp. 35-41
Abstract
In this paper, we analyze the performance of the recently introduced Hint-based knowledge distillation (KD) training approach, which uses the teacher-student framework for knowledge distillation and transfer. As the deep neural network (DNN) for the teacher-student framework, we use the deep residual network (ResNet), one of the current state-of-the-art DNN architectures. When implementing Hint-KD training, we investigate how the weight given to the KD information, derived from the softening factor, affects classification accuracy, using the widely used open-source deep learning framework Caffe. The results show that the recognition accuracy of the student model improves when the weight of the KD information is held fixed during training rather than gradually decreased.
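The weighted KD objective the abstract refers to can be sketched as follows. This is a minimal NumPy illustration of the standard distillation loss of Hinton et al. (reference 4), not the paper's exact implementation: the temperature `T` (the softening factor), the weight `lam`, and the function names are assumptions chosen for clarity.

```python
import numpy as np

def softmax(logits, T=1.0):
    # A temperature T > 1 softens the output distribution (Hinton et al., 2015).
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, lam=0.5):
    # Hard-label term: ordinary cross-entropy with the ground-truth classes.
    p_s = softmax(student_logits)
    n = len(labels)
    hard = -np.log(p_s[np.arange(n), labels] + 1e-12).mean()

    # Soft-label (KD) term: cross-entropy between the softened teacher and
    # student outputs, scaled by T^2 so its gradient magnitude matches the
    # hard-label term.
    q_t = softmax(teacher_logits, T)
    q_s = softmax(student_logits, T)
    soft = -(q_t * np.log(q_s + 1e-12)).sum(axis=-1).mean() * T * T

    # lam weights the KD information; the paper compares holding lam fixed
    # throughout training against gradually decreasing it.
    return (1.0 - lam) * hard + lam * soft
```

Under this formulation, decaying `lam` toward zero during training reduces the loss to plain supervised cross-entropy, whereas a fixed `lam` keeps the teacher's soft targets influential for the whole schedule, which is the setting the abstract reports as more accurate.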
Keywords
Knowledge distillation; Hint training; Deep residual networks; Caffe;
Citations & Related Records
Reference
1 Y.-T. Park, "A comparative study of image recognition by neural network classifier and linear tree classifier", Journal of The Institute of Electronics and Information Engineers-B, vol. 31, no. 5, pp. 141-148, 1994.
2 S. Hong, W. Im, J. Park, and H.-S. Yang, "Deep CNN-based person identification using facial and clothing features", in Proc. of Summer Conference on Institute of Electronics and Information Engineers (IEIE), pp. 2204-2207, June, 2016.
3 Y. Shin, J.-H. Park, S. Shin, G. Lim, S. Song, C. Lee, and J.-M. Chung, "Improvement of image classification in augmented reality based on deep learning", in Proc. of Summer Conference on Institute of Electronics and Information Engineers (IEIE), pp. 1771-1773, June, 2016.
4 G. Hinton, O. Vinyals, and J. Dean, "Distilling the knowledge in a neural network", arXiv preprint arXiv:1503.02531, pp. 1-19, 2015.
5 A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio, "FitNets: Hints for thin deep nets", in Proc. of 3rd International Conference on Learning Representations (ICLR), pp. 1-13, San Diego, May 7-9, 2015.
6 K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition", in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1-12, Las Vegas, June 26-July 1, 2016.
7 A. Veit, M. Wilber, and S. Belongie, "Residual networks are exponential ensembles of relatively shallow networks", arXiv preprint arXiv:1605.06431, pp. 1-12, 2016.
8 I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio, "Maxout networks", arXiv preprint arXiv:1302.4389, pp. 1-9, 2013.
9 K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition", in Proc. of 3rd International Conference on Learning Representations (ICLR), pp. 1-14, San Diego, May 7-9, 2015.
10 "Caffe, deep learning framework", [Online]. Available: http://caffe.berkeleyvision.org/
11 "CIFAR-10 and CIFAR-100 datasets", [Online]. Available: https://www.cs.toronto.edu/~kriz/cifar.html
12 "MNIST dataset", [Online]. Available: http://yann.lecun.com/exdb/mnist