http://dx.doi.org/10.3837/tiis.2019.11.009

Bagging deep convolutional autoencoders trained with a mixture of real data and GAN-generated data  

Hu, Cong (School of Internet of Things Engineering, Jiangnan University)
Wu, Xiao-Jun (School of Internet of Things Engineering, Jiangnan University)
Shu, Zhen-Qiu (School of Internet of Things Engineering, Jiangnan University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS), vol. 13, no. 11, 2019, pp. 5427-5445
Abstract
While deep neural networks have achieved remarkable performance in representation learning, supervised deep models such as convolutional neural networks usually require a huge amount of labeled training data. In this paper, we propose a new representation learning method, namely generative adversarial network (GAN) based bagging deep convolutional autoencoders (GAN-BDCAE), which maps data to diverse hierarchical representations in an unsupervised fashion. Enlarging the training set, training deep models, and aggregating diverse learning machines are three principal avenues for increasing the representation learning capability of neural networks, and we focus on combining these three techniques. To this end, we adopt a GAN to generate realistic unlabeled samples and bagging deep convolutional autoencoders (BDCAE) for robust feature learning. The proposed method improves the discriminative ability of the learned feature embedding for subsequent pattern recognition problems. We evaluate our approach on three standard benchmarks and demonstrate that it outperforms traditional unsupervised learning methods.
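The pipeline outlined above can be illustrated with a minimal PyTorch sketch: a pretrained GAN generator synthesizes extra unlabeled images, the real and generated images are mixed and bootstrap-resampled, each bootstrap sample trains one small convolutional autoencoder, and the encoders' codes are aggregated into a single representation. This is only an illustrative sketch under assumed settings (a DCGAN-style generator mapping 100-dimensional noise to 1x28x28 images, a two-layer autoencoder, concatenation as the aggregation step); it is not the authors' exact architecture or training procedure.

# Minimal sketch of a GAN-BDCAE-style pipeline (illustrative assumptions only).
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """One member of the bagged ensemble: a small convolutional autoencoder (assumed layout)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 28x28 -> 14x14
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 14x14 -> 7x7
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def train_bagged_autoencoders(real_images, generator, n_models=3, n_fake=1000,
                              epochs=5, batch_size=128, device="cpu"):
    """Mix real and GAN-generated images, then train each autoencoder on a bootstrap sample."""
    with torch.no_grad():
        noise = torch.randn(n_fake, 100, device=device)      # assumed 100-d latent prior of the generator
        fake_images = generator(noise).cpu()                 # assumed to output 1x28x28 images
    pool = torch.cat([real_images, fake_images], dim=0)      # augmented unlabeled pool

    models = []
    for _ in range(n_models):
        idx = torch.randint(0, len(pool), (len(pool),))      # bootstrap resampling (bagging)
        data = pool[idx]
        model = ConvAutoencoder().to(device)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            for start in range(0, len(data), batch_size):
                x = data[start:start + batch_size].to(device)
                recon, _ = model(x)
                loss = loss_fn(recon, x)                      # reconstruction objective
                opt.zero_grad()
                loss.backward()
                opt.step()
        models.append(model)
    return models

def aggregated_features(models, x):
    """Concatenate the encoders' codes to form the final, diverse representation."""
    with torch.no_grad():
        codes = [m.encoder(x).flatten(1) for m in models]
    return torch.cat(codes, dim=1)

Bootstrap resampling over the mixed real/generated pool is what makes the ensemble "bagged"; concatenating the encoders' codes is one simple aggregation choice, and the paper's actual aggregation step may differ.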
Keywords
representation learning; unsupervised learning; generative adversarial networks; deep convolutional autoencoders; bagging