http://dx.doi.org/10.7465/jkdi.2017.28.6.1217

A study on complexity of deep learning model  

Kim, Dongha (Department of Statistics, Seoul National University)
Baek, Gyuseung (Department of Statistics, Seoul National University)
Kim, Yongdai (Department of Statistics, Seoul National University)
Publication Information
Journal of the Korean Data and Information Science Society / v.28, no.6, 2017, pp. 1217-1227
Abstract
Deep learning has been studied intensively and has achieved excellent performance in areas such as image and speech recognition, application areas that have been difficult to handle with ordinary machine learning techniques. Theoretical study of deep learning has also progressed with the goal of explaining and improving this performance. In this paper, we locate a key to the success of deep learning in the rich and efficient expressiveness of deep learning functions, and we review the theoretical studies related to this expressiveness.
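
To make the notion of expressiveness concrete: a network with ReLU activations computes a piecewise-linear function, and one complexity measure studied in the line of work this article surveys is the number of linear regions of that function. The Python sketch below is purely illustrative and is not taken from the paper: it estimates the region count for a network with a one-dimensional input by enumerating distinct ReLU on/off patterns on a fine grid, and the network sizes, random weights, and function names are assumptions made here for the example.

import numpy as np

rng = np.random.default_rng(0)

def random_relu_net(widths):
    # Random weights for a fully connected ReLU network: 1-D input, given hidden widths, 1-D output.
    dims = [1] + list(widths) + [1]
    return [(rng.standard_normal((m, n)), rng.standard_normal(n))
            for m, n in zip(dims[:-1], dims[1:])]

def activation_pattern(params, x):
    # On/off state of every hidden ReLU at scalar input x; this pattern is constant within one linear region.
    h, pattern = np.array([x]), []
    for W, b in params[:-1]:
        h = h @ W + b
        pattern.extend(h > 0)
        h = np.maximum(h, 0.0)
    return tuple(pattern)

def count_linear_regions(params, grid):
    # Distinct activation patterns over a fine grid approximate the number of linear pieces.
    return len({activation_pattern(params, x) for x in grid})

grid = np.linspace(-3.0, 3.0, 20000)
shallow = random_relu_net([24])      # one hidden layer, 24 units
deep = random_relu_net([8, 8, 8])    # three hidden layers, 24 units in total
print("shallow regions:", count_linear_regions(shallow, grid))
print("deep regions:", count_linear_regions(deep, grid))

With randomly drawn weights the two counts can be similar; the exponential advantage of depth established in the theoretical results is obtained with carefully constructed weights, so this sketch only illustrates how the complexity measure is defined, not the depth separation itself.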
Keywords
Complexity; deep learning; deep neural network; linear regions; trajectory of a function; transition of a function;