[1] C. S. Smith, "The Man Who Helped Turn Toronto into a High-Tech Hotbed," The New York Times. Retrieved 27 June 2017.
[2] J. Liang, E. Meyerson, and R. Miikkulainen, "Evolutionary architecture search for deep multitask networks," arXiv preprint arXiv:1803.03745, 2018.
[3] J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, Q. Le, M. Mao, M. Ranzato, A. Senior, P. Tucker, K. Yang, and A. Ng, "Large scale distributed deep networks," in NIPS, 2012.
[4] T. Schaul, S. Zhang, and Y. LeCun, "No more pesky learning rates," arXiv preprint arXiv:1206.1106, 2012.
[5] N. Jaitly, P. Nguyen, A. Senior, and V. Vanhoucke, "Application of pretrained deep neural networks to large vocabulary speech recognition," in Interspeech, 2012.
[6] G. Morse and K. O. Stanley, "Simple evolutionary optimization can rival stochastic gradient descent in neural networks," in The Genetic and Evolutionary Computation Conference (GECCO), pages 477-484, 2016.
[7] J. Duchi, E. Hazan, and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," in COLT, 2010.
[8] S. Becker and Y. LeCun, "Improving the convergence of back-propagation learning with second order methods," Tech. Rep., Department of Computer Science, University of Toronto, Toronto, ON, Canada, 1988.