Recent Research & Development Trends in Automated Machine Learning

Automated Machine Learning (AutoML) Technology Trends

  • Published: 2019.08.01

Abstract

The performance of a machine learning algorithm depends significantly on how its hyperparameters are configured and how its neural network architecture is designed. However, both tasks demand expert knowledge of the relevant task domain and prohibitive computation time. To optimize these two processes with minimal effort, many recent studies have investigated automated machine learning (AutoML). This paper reviews the conventional random, grid, and Bayesian methods for hyperparameter optimization (HPO) and examines recent approaches that speed up the identification of the best hyperparameter configuration. We further survey existing neural architecture search (NAS) techniques based on evolutionary algorithms, reinforcement learning, and gradient-based methods, and analyze their theoretical characteristics and reported performance. Finally, future research directions and open challenges in HPO and NAS are discussed.
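To make the HPO setting above concrete, the minimal sketch below combines random search over a two-dimensional hyperparameter space with successive halving, the budget-allocation scheme at the core of bandit-based speed-up methods such as Hyperband. The objective function, search ranges, and budget schedule are illustrative assumptions for this sketch, not details drawn from the paper.

  import math
  import random

  # Illustrative stand-in for "validation score after training a model
  # with a given budget" -- an assumption for this sketch, not a detail
  # from the paper. The optimum sits at lr = 0.1, wd = 1e-4, and
  # low-budget evaluations are noisier than high-budget ones.
  def evaluate(config, budget):
      lr, wd = config
      quality = -((math.log10(lr) + 1) ** 2 + (math.log10(wd) + 4) ** 2)
      return quality + random.gauss(0, 2.0 / budget)

  # Random search samples each hyperparameter independently; log-uniform
  # ranges are a common choice for learning rate and weight decay.
  def sample_config():
      return (10 ** random.uniform(-4, 0), 10 ** random.uniform(-6, -2))

  # Successive halving: evaluate many configurations cheaply, keep the
  # top 1/eta at each rung, and multiply the per-configuration budget
  # by eta until a single survivor remains.
  def successive_halving(n=27, min_budget=1, eta=3):
      configs, budget = [sample_config() for _ in range(n)], min_budget
      while len(configs) > 1:
          ranked = sorted(configs, key=lambda c: evaluate(c, budget),
                          reverse=True)
          configs = ranked[: max(1, len(ranked) // eta)]
          budget *= eta
      return configs[0]

  random.seed(0)
  print("selected configuration (lr, weight_decay):", successive_halving())

Hyperband itself hedges against a poor trade-off between the number of configurations and the starting budget by running several such brackets with different (n, min_budget) pairs; the sketch above corresponds to a single bracket.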

Acknowledgement

Grant: Development of big data edge analytics technology for load balancing and proactive, timely response

Supported by: Institute of Information & Communications Technology Planning & Evaluation (IITP)
