1 |
Y. LeCun, Y. Bengio and G. Hinton, "Deep learning," Nature, Vol. 521, pp. 436-444, 2015.
|
2 |
T. Mikolov, et al., "Recurrent neural network based language model," Interspeech, Vol. 2, 2010.
|
3 |
G. Hinton, et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Processing Magazine, Vol. 29, No. 6, pp. 82-97, 2012.
|
4 |
G. E. Hinton, "Deep belief networks," Scholarpedia, Vol. 4, No. 5, 2009.
|
5 |
S. Becker and Y. Le Cun, "Improving the convergence of back-propagation learning with second order methods," in Proc. of the 1988 Connectionist Models Summer School, 1988.
|
6 |
Y. Nesterov, "A method of solving a convex programming problem with convergence rate O(1/k^2)," Soviet Mathematics Doklady, Vol. 27, No. 2, 1983.
|
7 |
J. Duchi, E. Hazan and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," Journal of Machine Learning Research, Vol. 12, pp. 2121-2159, Jul. 2011.
|
8 |
T. Tieleman and G. Hinton, "Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude," COURSERA: Neural Networks for Machine Learning, Vol. 4, No. 2, pp. 26-31, 2012.
|
9 |
D. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
|
10 |
D. W. Hosmer Jr., S. Lemeshow and R. X. Sturdivant, Applied Logistic Regression, Vol. 398, John Wiley & Sons, 2013.
|
11 |
D. W. Ruck, et al., "The multilayer perceptron as an approximation to a Bayes optimal discriminant function," IEEE Transactions on Neural Networks, Vol. 1, No. 4, pp. 296-298, 1990.
|
12 |
M. Schuster and K. K. Paliwal, "Bidirectional recurrent neural networks," IEEE Transactions on Signal Processing, Vol. 45, No. 11, pp. 2673-2681, 1997.
|
13 |
A. Krizhevsky, I. Sutskever and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, 2012.
|
14 |
N. Srivastava, et al., "Dropout: a simple way to prevent neural networks from overfitting," Journal of Machine Learning Research, Vol. 15, No. 1, pp. 1929-1958, 2014.
|