http://dx.doi.org/10.7471/ikeee.2020.24.4.969

Comparison of Hyper-Parameter Optimization Methods for Deep Neural Networks  

Kim, Ho-Chan (Dept. of Electrical Eng., Jeju National Univ.)
Kang, Min-Jae (Dept. of Electrical Eng., Jeju National Univ.)
Publication Information
Journal of IKEEE / v.24, no.4, 2020, pp. 969-974
Abstract
Research on hyperparameter optimization (HPO) has recently been revived by interest in models with many hyperparameters, such as deep neural networks. In this paper, we introduce the most widely used HPO methods, namely grid search, random search, and Bayesian optimization, and investigate their characteristics through experiments. Experiments on the MNIST data set are used to compare the methods and to determine which one reaches the highest accuracy within a relatively short simulation time. The learning rate and weight decay were chosen as the hyperparameters to tune because they are the parameters most commonly tuned in experiments of this kind.
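For illustration, the following minimal sketch contrasts the three methods on a synthetic objective that stands in for MNIST validation error; it is not the paper's experimental code. The search ranges, the placement of the optimum, and the use of the third-party scikit-optimize library (gp_minimize) for the Bayesian step are assumptions made for this example.

import math
import random

from skopt import gp_minimize          # third-party: pip install scikit-optimize
from skopt.space import Real

def validation_error(lr, wd):
    # Synthetic stand-in for (1 - validation accuracy) of a network trained
    # on MNIST; its minimum at lr=1e-3, wd=1e-4 is an assumption chosen so
    # the script runs in seconds without training anything.
    return (math.log10(lr) + 3) ** 2 + (math.log10(wd) + 4) ** 2

# Grid search: exhaustively evaluate a fixed, log-spaced 5x5 grid (25 trials).
lrs = [10.0 ** e for e in range(-5, 0)]
wds = [10.0 ** e for e in range(-6, -1)]
grid_best = min((validation_error(lr, wd), lr, wd) for lr in lrs for wd in wds)

# Random search: 25 log-uniform samples from the same ranges, so the
# evaluation budget matches the 25-point grid above.
rng = random.Random(0)
rand_best = min(
    (validation_error(lr, wd), lr, wd)
    for lr, wd in ((10.0 ** rng.uniform(-5, -1), 10.0 ** rng.uniform(-6, -2))
                   for _ in range(25))
)

# Bayesian optimization: a Gaussian-process surrogate proposes each next
# trial point, again with a 25-evaluation budget.
space = [Real(1e-5, 1e-1, prior="log-uniform", name="lr"),
         Real(1e-6, 1e-2, prior="log-uniform", name="wd")]
bo = gp_minimize(lambda p: validation_error(*p), space, n_calls=25, random_state=0)

print("grid search  :", grid_best)
print("random search:", rand_best)
print("bayesian opt :", bo.fun, bo.x)

With the budget held at 25 trials for all three methods, random search covers the log-scaled ranges more evenly than the coarse grid, while the Gaussian-process search concentrates later trials near the best region found so far; this budget-matched setup mirrors the kind of comparison described in the abstract.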
Keywords
hyperparameter optimization; deep neural networks; grid search; random search; Bayesian optimization
References
1 M. Feurer and F. Hutter, "Hyperparameter optimization," in F. Hutter, L. Kotthoff, and J. Vanschoren (Eds.), Automated Machine Learning, pp.3-33, Springer, 2019.
2 J. Bergstra, D. Yamins, and D. D. Cox, "Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures," in Proc. of the 30th International Conference on Machine Learning, vol.28, pp.115-123, 2013. DOI: 10.5555/3042817.3042832
3 J. Bergstra and Y. Bengio, "Random search for hyper-parameter optimization," Journal of Machine Learning Research, vol.13, pp.281-305, 2012. DOI: 10.5555/2188385.2188395
4 J. Snoek, H. Larochelle, and R. P. Adams, "Practical Bayesian optimization of machine learning algorithms," Advances in Neural Information Processing Systems, vol.25, pp.2951-2959, 2012.
5 E. Brochu, V. M. Cora, and N. de Freitas, "A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning," Technical Report TR-2009-23, University of British Columbia, 2009.
6 J. Wang, J. Xu, and X. Wang, "Combination of hyperband and bayesian optimization for hyperparameter optimization in deep learning," arXiv preprint arXiv:1801.01596, 2018. https://arxiv.org/abs/1801.01596
7 S. Falkner, A. Klein, and F. Hutter, "BOHB: Robust and efficient hyperparameter optimization at scale," Proceedings of Machine Learning Research, vol.80, pp.1437-1446, 2018.
8 C. Harrington, "Practical guide to hyperparameters optimization for deep learning models," Deep Learning, 2018. https://blog.floydhub.com/guide-to-hyperparameters-search-for-deep-learning-models/
9 Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol.86, no.11, pp.2278-2324, 1998.