[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7236/IJASC.2017.6.4.10

Improvement of the Convergence Rate of Deep Learning by Using Scaling Method

Ho, Jiacang (Department of Ubiquitous IT, Graduate School, Dongseo University)
Kang, Dae-Ki (Department of Computer Engineering, Dongseo University)

Publication Information

International journal of advanced smart convergence / v.6, no.4, 2017 , pp. 67-72 More about this Journal

Abstract

Deep learning neural network becomes very popular nowadays due to the reason that it can learn a very complex dataset such as the image dataset. Although deep learning neural network can produce high accuracy on the image dataset, it needs a lot of time to reach the convergence stage. To solve the issue, we have proposed a scaling method to improve the neural network to achieve the convergence stage in a shorter time than the original method. From the result, we can observe that our algorithm has higher performance than the other previous work.

Keywords

Deep learning; neural network; convergence rate; scaling method;

Citations & Related Records

Reference

1	Y. LeCun, Y. Bengio, and G. Hinton. "Deep learning," Nature Vol. 521, pp. 436-444, 2015. DOI
2	T. Mikolov et al., "Recurrent neural network based language model," Interspeech, Vol. 2, 2010.
3	G. Hinton, et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Processing Magazine, Vol. 29, No. 6, pp. 82-97, 2012. DOI
4	G. E. Hinton, "Deep belief networks," Scholarpedia, Vol. 4, No. 5, 2009.
5	S. Becker and Y. Le Cun, "Improving the convergence of back-propagation learning with second order methods," in Proc. of the 1988 connectionist models summer school. 1988.
6	Y. Nesterov, "A method of solving a convex programming problem with convergence rate O (1/k2)," Soviet Mathematics Doklady. Vol. 27. No. 2. 1983.
7	J. Duchi, E. Hazan and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," Journal of Machine Learning Research, pp. 2121-2159, 12 Jul. 2011.
8	T. Tieleman and G. Hinton, "Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude," COURSERA: Neural networks for machine learning 4.2, pp. 26-31, 2012.
9	D. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv, 1412.6980, 2014.
10	D. W. Hosmer Jr, S. Lemeshow and R. X. Sturdivant, Applied logistic regression. Vol. 398. John Wiley & Sons, 2013.
11	D. W. Ruck, et al., "The multilayer perceptron as an approximation to a Bayes optimal discriminant function," IEEE Transactions on Neural Networks, Vol. 1, No. 4, pp. 296-298, 1990. DOI
12	M. Schuster and K. K. Paliwal, "Bidirectional recurrent neural networks," IEEE Transactions on Signal Processing, Vol. 45, No. 11, pp. 2673-2681, 1997. DOI
13	A. Krizhevsky, I. Sutskever and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," Advances in neural information processing systems, 2012.
14	N. Srivastava, et al., "Dropout: a simple way to prevent neural networks from overfitting," Journal of machine learning research, Vol. 15, No. 1, pp. 1929-1958, 2014.