Effects of Hyper-parameters and Dataset on CNN Training

  • Nguyen, Huu Nhan (Dept. of Electronic Engineering, Soongsil University) ;
  • Lee, Chanho (Dept. of Electronic Engineering, Soongsil University)
  • Received : 2018.02.20
  • Reviewed : 2018.03.29
  • Published : 2018.03.31

Abstract

The purpose of training a convolutional neural network (CNN) is to obtain weight factors that give high classification accuracy. The initial values of the hyper-parameters affect the training results, so it is important to train a CNN with a suitable hyper-parameter set: the learning rate, the batch size, the initialization of the weight factors, and the optimizer. We investigate the effect of each hyper-parameter in turn while the others are held fixed, in order to obtain a hyper-parameter set that gives higher classification accuracy and requires shorter training time. A proposed VGG-like CNN is used for training because VGG is widely used. The CNN is trained on four datasets: CIFAR10, CIFAR100, GTSRB, and DSDL-DB. The effects of normalization and data transformation of the datasets are also investigated, and a training scheme using merged datasets is proposed.
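
The one-hyper-parameter-at-a-time procedure described in the abstract could be organized roughly as follows. This is a minimal sketch, not the authors' code: the abstract names the four hyper-parameters but gives no values, so the baseline, the candidate values, and the names `BASELINE`, `CANDIDATES`, and `one_at_a_time` are all illustrative assumptions.

```python
from typing import Any, Dict, Iterator, List, Tuple

# Hypothetical baseline setting (assumed values, not taken from the paper).
BASELINE: Dict[str, Any] = {
    "learning_rate": 0.01,
    "batch_size": 128,
    "weight_init": "he_normal",
    "optimizer": "sgd",
}

# Hypothetical candidate values for each hyper-parameter (also assumed).
CANDIDATES: Dict[str, List[Any]] = {
    "learning_rate": [0.1, 0.01, 0.001],
    "batch_size": [32, 64, 128, 256],
    "weight_init": ["he_normal", "glorot_uniform"],
    "optimizer": ["sgd", "adam", "rmsprop"],
}


def one_at_a_time(
    baseline: Dict[str, Any], candidates: Dict[str, List[Any]]
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (varied_name, config) pairs in which a single hyper-parameter
    takes a candidate value and all others stay at their baseline values."""
    for name, values in candidates.items():
        for value in values:
            config = dict(baseline)  # copy so the baseline is never mutated
            config[name] = value
            yield name, config


# Each entry describes one training run to launch and compare.
runs = list(one_at_a_time(BASELINE, CANDIDATES))
```

Each configuration in `runs` would be passed to a training routine, and the resulting accuracy and training time compared within each group of runs that share the same varied hyper-parameter. Sweeping one factor at a time keeps the number of runs equal to the sum of the candidate counts rather than their product, at the cost of not capturing interactions between hyper-parameters.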
