http://dx.doi.org/10.13064/KSSS.2016.8.2.023

Implementation of CNN in the view of mini-batch DNN training for efficient second order optimization  

Song, Hwa Jeon (Electronics and Telecommunications Research Institute)
Jung, Ho Young (Speech Processing Research Laboratory, Electronics and Telecommunications Research Institute)
Park, Jeon Gue (Speech Processing Research Laboratory, Electronics and Telecommunications Research Institute)
Publication Information
Phonetics and Speech Sciences, vol. 8, no. 2, 2016, pp. 23-30
Abstract
This paper describes implementation schemes for CNNs, viewed as mini-batch DNN training, that enable efficient second-order optimization. By simply rearranging an input image into a sequence of local patches, a CNN can be trained with the same parameter-update procedure as a DNN, since the rearranged input is exactly equivalent to a mini-batch of DNN training examples. Through this conversion, second-order optimization, which provides higher performance, can be applied directly to the CNN parameters. In both image recognition on the MNIST database and syllable-level automatic speech recognition, the proposed CNN implementation outperforms a DNN-based baseline.
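The patch rearrangement described above is essentially the classical im2col transformation. The following is a minimal NumPy sketch, not the authors' implementation, of how unrolling an image into local patches turns a convolutional layer into a single matrix multiply, the same operation as a fully connected DNN layer; the array names, image size, kernel size, and filter count are all illustrative assumptions.

```python
import numpy as np

def im2col(image, k):
    """Arrange every k x k local patch of a 2-D image as one row of a matrix."""
    h, w = image.shape
    rows = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            rows.append(image[i:i + k, j:j + k].ravel())
    return np.stack(rows)  # shape: (num_patches, k*k)

rng = np.random.default_rng(0)
image = rng.standard_normal((28, 28))       # e.g., one MNIST image (assumed size)
k, num_filters = 5, 8                       # illustrative kernel size / filter count
weights = rng.standard_normal((k * k, num_filters))  # DNN-style weight matrix

patches = im2col(image, k)       # (576, 25): patches act as a mini-batch of inputs
feature_maps = patches @ weights # (576, 8): one fully-connected-style forward pass
print(feature_maps.reshape(24, 24, num_filters).shape)  # (24, 24, 8) output maps
```

Because the patch matrix plays the role of a mini-batch, an optimizer written for DNN layers, including second-order methods, can in principle be reused unchanged to update the convolutional weights.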
Keywords
automatic speech recognition; DNN; CNN; second order optimization