Improvement of existing machine learning methods of digital signal by changing the step-size


  • Ji, Sangmin (Department of Mathematics, Chungnam National University) ;
  • Park, Jieun (Seongsan Liberal Arts College, Daegu University)
  • Received : 2019.11.26
  • Accepted : 2020.02.20
  • Published : 2020.02.28

Abstract

Machine learning is carried out by constructing a cost function from given digital signal data and minimizing that cost function. Depending on the amount of digital signal data and the structure of the neural network, the cost function contains local minima, and these local minima obstruct learning. Among the many ways of addressing this problem, our proposed method is to vary the learning step-size. Unlike existing methods that use the learning rate (step-size) as a fixed constant, defining it through a multivariate function based on the cost function prevents unnecessary machine learning and finds the optimal path to the minimum. Numerical experiments show that the proposed method improves performance by about 3% (88.8% → 91.5%) compared with the existing methods.
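The abstract does not give the exact form of the proposed step-size function. As a rough illustration of the idea only (a learning rate that depends on the current cost value instead of staying a fixed constant), the following Python sketch adapts plain gradient descent; the helper `adaptive_step` and the rule `eta0 * c / (1 + c)` are assumptions for demonstration, not the paper's actual multivariate function.

```python
import numpy as np

def cost(w, X, y):
    # Mean-squared-error cost for a linear model; stands in for the
    # neural-network cost function built from digital-signal data.
    return np.mean((X @ w - y) ** 2)

def grad(w, X, y):
    # Gradient of the MSE cost with respect to the weights.
    return 2.0 * X.T @ (X @ w - y) / len(y)

def adaptive_step(c, eta0=0.1):
    # Hypothetical step-size rule (assumption, not the paper's formula):
    # the learning rate shrinks as the cost decreases, rather than
    # remaining a fixed constant.
    return eta0 * c / (1.0 + c)

def train(X, y, iters=1000):
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        c = cost(w, X, y)
        w -= adaptive_step(c) * grad(w, X, y)  # step-size varies with the cost
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=100)
    print(train(X, y))
```

With a fixed constant in place of `adaptive_step`, the loop reduces to ordinary gradient descent; making the rate a function of the cost is what the abstract contrasts against that baseline.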


