1 |
X. Glorot and Y. Bengio, "Understanding the Difficulty of Training Deep Feedforward Neural Networks", In Proc. AISTATS. Society for Artificial Intelligence and Statistics, 2010.
|
2 |
K. He, X. Zhang, S. Ren and J, Sun, "Delving Deep into Rectifiers: Surpassing Human-level Performance on Imagenet Classification", In Proceedings of the IEEE International Conference on Computer Vision, 2015.
|
3 |
G. Klambauer, T. Unterthiner, A. Mayr and S. Hochreiter, "Self-Normalizing Neural Networks", CoRR, 2017, abs/1706.02515.
|
4 |
N. Qian, "On the Momentum Term in Gradient Descent Learning Algorithms", Neural Networks : The Official Journal of The International Neural Network Society, Vol. 12, No. 1, pp. 145-151, 1999.
DOI
|
5 |
D. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization", International Conference on Learning Representations, pp. 1-13, 2015.
|
6 |
G. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Improving Neural Networks by Preventing Coadaptation of Feature Detectors," 2012, http://arxiv.org/abs/1207.0580.
|