http://dx.doi.org/10.3745/KTSDE.2017.6.12.573

A Deep Neural Network Model Based on a Mutation Operator  

Jeon, Seung Ho (Department of Information Security, Korea University)
Moon, Jong Sub (Department of Electronics and Information Engineering, Korea University)
Publication Information
KIPS Transactions on Software and Data Engineering / v.6, no.12, 2017, pp.573-580
Abstract
A Deep Neural Network (DNN) is a layered neural network composed of many layers of non-linear units. Deep learning, represented by DNNs, has been applied very successfully in a variety of applications. However, past research has identified many issues in DNNs, and among them generalization is the most well-known problem. A recent technique, Dropout, successfully addressed this problem. Dropout also acts as noise, so it helps the network learn robust features during training, much like a Denoising AutoEncoder. However, because Dropout requires a large amount of computation, training takes a long time. Moreover, since Dropout keeps changing the inter-layer representation during training, the learning rate must be kept small, which lengthens training further. In this paper, we use a mutation operation to reduce computation and improve generalization performance compared with Dropout. We also compared the proposed method with Dropout experimentally and showed that our method outperforms it.
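For intuition only (this is not the authors' implementation), the sketch below contrasts standard inverted Dropout masking with a genetic-algorithm-style mutation operator applied to hidden activations: Dropout zeroes roughly half of the units every step, while mutation perturbs only a small random subset. The Gaussian perturbation, the 5% mutation rate, and all names are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p=0.5):
    # Inverted dropout: zero each hidden unit with probability p and
    # rescale the survivors so the expected activation is unchanged.
    mask = rng.random(h.shape) >= p
    return h * mask / (1.0 - p)

def mutate(h, rate=0.05):
    # Mutation-style noise (assumed variant): perturb only a small random
    # subset of units, leaving most of the layer representation intact.
    hit = rng.random(h.shape) < rate
    noise = rng.normal(0.0, h.std() + 1e-8, size=h.shape)
    return np.where(hit, h + noise, h)

h = rng.normal(size=(4, 8))   # a small batch of hidden activations
print(dropout(h)[0])          # roughly half of the units zeroed and rescaled
print(mutate(h)[0])           # only a few units perturbed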
Keywords
Deep Learning; Generalization; Denoising; Mutation Operation;
Reference
1 Y. LeCun, Y. Bengio, and G. Hinton, "Deep Learning," Nature, Vol.521, pp.436-444, 2015.
2 E. Levin, N. Tishby, and S. Solla, "A statistical approach to learning and generalization in layered neural networks," Proceedings of the IEEE, Vol.78, No.10, pp.1568-1574, 1990.
3 C. M. Bishop, "Neural Networks for Pattern Recognition," Oxford University Press, pp.332-380, 1995.
4 N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting," Journal of Machine Learning Research, Vol.15, No.1, pp.1929-1958, 2014.
5 S. Ioffe and C. Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," In International Conference on Machine Learning, pp.448-456, 2015.
6 M. Mitchell, "Genetic Algorithms: An Overview," Complexity, Vol.1, No.1, pp.31-39, 1995.
7 L. K. Hansen and P. Salamon, "Neural Network Ensembles," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.12, No.10, pp.993-1001, 1990.
8 P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and Composing Robust Features with Denoising Autoencoders," In International Conference on Machine Learning, pp.1096-1103, 2008.
9 P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," Journal of Machine Learning Research, Vol.11, pp.3371-3408, 2010.
10 R. S. Sutton and A. G. Barto, "Reinforcement Learning: An Introduction," Cambridge: MIT Press, pp.25-42, 2012.
11 D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
12 L. Bottou, "Stochastic gradient learning in neural networks," Proceedings of Neuro-Nimes, Vol.91, No.8, 1991.
13 T. Tieleman and G. Hinton, "Lecture 6.5: RMSProp - Divide the gradient by a running average of its recent magnitude," In COURSERA: Neural Networks for Machine Learning, 2012.
14 Y. LeCun, C. Cortes, and C. J. C. Burges, "The MNIST Database of handwritten digits" [Internet], http://yann.lecun.com/exdb/mnist/.
15 X. Glorot, A. Bordes, and Y. Bengio, "Deep Sparse Rectifier Neural Networks," In International Conference on Artificial Intelligence and Statistics, 2011.