References
- Y. LeCun, Y. Bengio, and G. Hinton. "Deep learning," Nature Vol. 521, pp. 436-444, 2015. https://doi.org/10.1038/nature14539
- T. Mikolov et al., "Recurrent neural network based language model," Interspeech, Vol. 2, 2010.
- G. Hinton, et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Processing Magazine, Vol. 29, No. 6, pp. 82-97, 2012. https://doi.org/10.1109/MSP.2012.2205597
- G. E. Hinton, "Deep belief networks," Scholarpedia, Vol. 4, No. 5, 2009.
- S. Becker and Y. Le Cun, "Improving the convergence of back-propagation learning with second order methods," in Proc. of the 1988 connectionist models summer school. 1988.
- Y. Nesterov, "A method of solving a convex programming problem with convergence rate O (1/k2)," Soviet Mathematics Doklady. Vol. 27. No. 2. 1983.
- J. Duchi, E. Hazan and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," Journal of Machine Learning Research, pp. 2121-2159, 12 Jul. 2011.
- T. Tieleman and G. Hinton, "Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude," COURSERA: Neural networks for machine learning 4.2, pp. 26-31, 2012.
- D. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv, 1412.6980, 2014.
- D. W. Hosmer Jr, S. Lemeshow and R. X. Sturdivant, Applied logistic regression. Vol. 398. John Wiley & Sons, 2013.
- D. W. Ruck, et al., "The multilayer perceptron as an approximation to a Bayes optimal discriminant function," IEEE Transactions on Neural Networks, Vol. 1, No. 4, pp. 296-298, 1990. https://doi.org/10.1109/72.80266
- M. Schuster and K. K. Paliwal, "Bidirectional recurrent neural networks," IEEE Transactions on Signal Processing, Vol. 45, No. 11, pp. 2673-2681, 1997. https://doi.org/10.1109/78.650093
- A. Krizhevsky, I. Sutskever and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," Advances in neural information processing systems, 2012.
- N. Srivastava, et al., "Dropout: a simple way to prevent neural networks from overfitting," Journal of machine learning research, Vol. 15, No. 1, pp. 1929-1958, 2014.