[1] Y. LeCun, Y. Bengio, and G. Hinton, "Deep Learning," Nature, Vol. 521, No. 7553, pp. 436-444, 2015.
[2] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Advances in Neural Information Processing Systems, Vol. 25, pp. 1097-1105, 2012.
[3] Y. Lin, F. Lv, S. Zhu, M. Yang, T. Cour, and K. Yu, "Large-Scale Image Classification: Fast Feature Extraction and SVM Training," Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1689-1696, 2011.
[4] J. Sanchez, F. Perronnin, and T. Mensink, "Improving the Fisher Kernel for Large-Scale Image Classification," Proceedings of the 11th European Conference on Computer Vision, pp. 143-156, 2010.
[5] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-Based Learning Applied to Document Recognition," Proceedings of the IEEE, Vol. 86, No. 11, pp. 2278-2324, 1998.
[6] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," arXiv preprint arXiv:1512.03385, 2015.
[7] S. Zagoruyko and N. Komodakis, "Wide Residual Networks," arXiv preprint arXiv:1605.07146, 2016.
[8] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, et al., "Caffe: Convolutional Architecture for Fast Feature Embedding," Proceedings of the 22nd ACM International Conference on Multimedia, pp. 675-678, 2014.
[9] X. Glorot and Y. Bengio, "Understanding the Difficulty of Training Deep Feedforward Neural Networks," Proceedings of the International Conference on Artificial Intelligence and Statistics, pp. 249-256, 2010.
[10] V. Nair and G. Hinton, "Rectified Linear Units Improve Restricted Boltzmann Machines," Proceedings of the 27th International Conference on Machine Learning, pp. 807-814, 2010.
[11] K. He, X. Zhang, S. Ren, and J. Sun, "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification," Proceedings of the International Conference on Computer Vision, pp. 1026-1034, 2015.
[12] L. Bottou, "Stochastic Gradient Descent Tricks," Neural Networks: Tricks of the Trade, Lecture Notes in Computer Science, Vol. 7700, pp. 430-445, 2012.
[13] J. Duchi, E. Hazan, and Y. Singer, "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization," Journal of Machine Learning Research, Vol. 12, pp. 2121-2159, 2011.
[14] T. Tieleman and G. Hinton, "RMSProp: Divide the Gradient by a Running Average of Its Recent Magnitude," COURSERA: Neural Networks for Machine Learning Technical Report, 2012.
[15] I. Sutskever, J. Martens, G. Dahl, and G. Hinton, "On the Importance of Initialization and Momentum in Deep Learning," Proceedings of the 30th International Conference on Machine Learning, Vol. 28, pp. 1139-1147, 2013.
[16] N. Qian, "On the Momentum Term in Gradient Descent Learning Algorithms," Neural Networks: The Official Journal of the International Neural Network Society, Vol. 12, No. 1, pp. 145-151, 1999.
[17] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, et al., "Going Deeper with Convolutions," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9, 2015.
[18] D.P. Kingma and J.L. Ba, "Adam: A Method for Stochastic Optimization," Proceedings of the International Conference on Learning Representations, arXiv:1412.6980, 2015.
[19] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, et al., "ImageNet Large Scale Visual Recognition Challenge," International Journal of Computer Vision, Vol. 115, No. 3, pp. 211-252, 2015.
[20] Caffe, http://caffe.berkeleyvision.org/tutorial/solver.html (accessed June 20, 2014).
[21] G. Huang, Z. Liu, L. van der Maaten, and K.Q. Weinberger, "Densely Connected Convolutional Networks," Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2261-2269, 2017.
[22] Y. Jeong, I. Ansari, J. Shim, and J. Lee, "A Car Plate Area Detection System Using Deep Convolution Neural Network," Journal of Korea Multimedia Society, Vol. 20, No. 8, pp. 1166-1174, 2017.
[23] M. Zeiler, "ADADELTA: An Adaptive Learning Rate Method," arXiv preprint arXiv:1212.5701, 2012.