Finding the best suited autoencoder for reducing model complexity

  • Received : 2021.06.23
  • Accepted : 2021.07.22
  • Published : 2021.09.30

Abstract

Machine learning models use input data to produce results, but sometimes the input data is too complicated for the models to learn useful patterns. Feature engineering is therefore a crucial data preprocessing step for constructing a proper feature set that improves model performance. One of the most efficient methods for automating feature engineering is the autoencoder, which transforms the data from its original space into a latent space. However, certain factors should be carefully considered when using an autoencoder: the dataset, the machine learning model, and the number of dimensions of the latent space (denoted by k). In this study, we design a framework to compare two data preprocessing approaches, with and without an autoencoder, and to observe the impact of these factors on the autoencoder. We then conduct experiments that pair autoencoders with classifiers on popular datasets. The empirical results offer a perspective on which autoencoder is best suited under each combination of these factors.
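
To make the comparison framework concrete, the sketch below pairs a simple dense autoencoder with a classifier, once on the raw features and once on the k-dimensional latent codes. It is a minimal illustration under assumed choices (MNIST, a 784-128-k dense architecture with k = 32, and a random-forest classifier via Keras and scikit-learn), not the authors' exact experimental setup.

    # Minimal sketch of the two pipelines the study compares:
    # (a) raw features -> classifier, (b) autoencoder latent features -> classifier.
    import tensorflow as tf
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score

    (x_tr, y_tr), (x_te, y_te) = tf.keras.datasets.mnist.load_data()
    x_tr = x_tr.reshape(len(x_tr), -1).astype("float32") / 255.0
    x_te = x_te.reshape(len(x_te), -1).astype("float32") / 255.0
    x_tr, y_tr = x_tr[:10000], y_tr[:10000]  # subsample so the sketch runs quickly

    k = 32  # latent dimensionality -- one of the factors under study

    # Symmetric dense autoencoder, 784 -> 128 -> k -> 128 -> 784
    # (an assumed architecture, not necessarily the one used in the paper).
    inp = tf.keras.Input(shape=(784,))
    h = tf.keras.layers.Dense(128, activation="relu")(inp)
    z = tf.keras.layers.Dense(k, activation="relu")(h)
    h = tf.keras.layers.Dense(128, activation="relu")(z)
    out = tf.keras.layers.Dense(784, activation="sigmoid")(h)

    autoencoder = tf.keras.Model(inp, out)
    encoder = tf.keras.Model(inp, z)  # reused later to extract latent features
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(x_tr, x_tr, epochs=10, batch_size=256, verbose=0)

    # (a) classifier trained on the original 784-dimensional feature space
    clf_raw = RandomForestClassifier(n_estimators=100).fit(x_tr, y_tr)
    acc_raw = accuracy_score(y_te, clf_raw.predict(x_te))

    # (b) the same classifier trained on the k-dimensional latent space
    z_tr = encoder.predict(x_tr, verbose=0)
    z_te = encoder.predict(x_te, verbose=0)
    clf_lat = RandomForestClassifier(n_estimators=100).fit(z_tr, y_tr)
    acc_lat = accuracy_score(y_te, clf_lat.predict(z_te))

    print(f"raw features : {acc_raw:.4f}")
    print(f"latent (k={k}): {acc_lat:.4f}")

Varying k, the dataset, and the classifier in this loop reproduces, in miniature, the factor study the abstract describes.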

Acknowledgement

This work was supported by the Korea Institute of Science and Technology Information (KISTI).
