[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7472/jksii.2019.20.6.65

A layered-wise data augmenting algorithm for small sampling data

Cho, Hee-chan (Graduate School of Information Security, Korea University)
Moon, Jong-sub (Graduate School of Information Security, Korea University)

Publication Information

Journal of Internet Computing and Services / v.20, no.6, 2019, pp. 65-72

Abstract

Data augmentation increases the amount of data by applying various algorithms to a small set of sample data. When machine learning and deep learning techniques are applied to real-world problems, the available data sets are often too small. With too little data, a model is at greater risk of underfitting and overfitting, and it reflects the characteristics of the data poorly. This paper therefore proposes a layer-wise data augmentation method, applied at each layer of a deep neural network, that produces substantially meaningful augmented data. Experiments measuring classification accuracy show that the proposed method is effective for training the model.
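The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way eigen decomposition (listed in the keywords) can be used to augment a small sample. It is an assumption for illustration, not the authors' method. The helper name `eigen_augment` and the parameters `n_new`, `scale`, and `seed` are hypothetical. The input `X` is any feature matrix with at least two samples, which could be raw inputs or, in a layer-wise setting, the activations of one hidden layer.

```python
import numpy as np

def eigen_augment(X, n_new, scale=0.1, seed=0):
    """Hypothetical helper: create n_new synthetic rows by perturbing
    real rows along the principal axes of the sample covariance."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Eigen decomposition of the symmetric covariance matrix.
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Guard against tiny negative eigenvalues caused by round-off.
    eigvals = np.clip(eigvals, 0.0, None)
    # Pick real samples (with replacement) to perturb.
    base = X[rng.integers(0, len(X), size=n_new)]
    # Noise in the eigenbasis, with standard deviation proportional
    # to the square root of each eigenvalue.
    coeffs = rng.normal(size=(n_new, X.shape[1])) * np.sqrt(eigvals) * scale
    # Map the noise back to the original feature space.
    return base + coeffs @ eigvecs.T
```

Because the noise is scaled by the eigenvalues, synthetic points vary most along the directions in which the real data vary and stay close to the data's own subspace. An augmented training set could then be formed with `np.vstack([X, eigen_augment(X, n_new=100)])`, reusing the label of each perturbed row if labels are tracked alongside `base`.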

Keywords

Deep learning; data augmentation; eigen decomposition

