Improvement of existing machine learning methods of digital signal by changing the step-size

Ji, Sangmin;Park, Jieun;

doi:10.14400/JDC.2020.18.2.261

디지털융복합연구 (Journal of Digital Convergence)

제18권2호
/
Pages.261-268
/
2020
/
2713-6434(pISSN)
/
2713-6442(eISSN)

한국디지털정책학회 (The Society of Digital Policy and Management)

DOI QR Code

학습률(Step-Size)변화에 따른 디지털 신호의 기계학습 방법 개선

Improvement of existing machine learning methods of digital signal by changing the step-size

지상민 (충남대학교 수학과) ;
박지은 (대구대학교 인문교양대학)

Ji, Sangmin (Department of Mathematics, Chungnam National University) ;
Park, Jieun (Seongsan Liberal Arts College, Daegu University)

투고 : 2019.11.26
심사 : 2020.02.20
발행 : 2020.02.28

https://doi.org/10.14400/JDC.2020.18.2.261 인용 PDF KSCI

PDF 다운로드

⟨ 이전 논문 다음 논문 ⟩

초록

기계학습은 주어진 디지털 신호 Data로부터 비용함수를 만들고, 그 비용함수를 최소화함으로 학습이 이루어진다. 비용함수는 디지털 신호 Data의 양과 인공신경망의 구조에 따라 비용함수에 부분 최솟값(local minimum)들이 생기게 된다. 비용함수의 부분 최솟값들은 학습을 방해하는 요소가 된다. 이러한 방법을 해결하는 여러 방법 중 우리의 제안 방법은 학습률(Step-size)을 변화시키는 방법이다. 학습률을 고정된 상수로 이용하는 기존의 방법과는 다르게 비용함수를 이용한 다변수함수를 이용함으로써 불필요한 기계학습이 이루어지는 것을 방지할 수 있으며, 최솟값으로 가는 최적의 길을 찾을 수 있다. 수치적 실험을 통하여 기존의 방법보다 우리가 제안하는 방법을 이용하여 약 3%(88.8%→91.5%)의 성능이 향상하는 결과를 얻었다.

Machine learning is achieved by making a cost function from a given digital signal data and optimizing the cost function. The cost function here has local minimums in the cost function depending on the amount of digital signal data and the structure of the neural network. These local minimums make a problem that prevents learning. Among the many ways of solving these methods, our proposed method is to change the learning step-size. Unlike existed methods using the learning rate (step-size) as a fixed constant, the use of multivariate function as the cost function prevent unnecessary machine learning and find the best way to the minimum value. Numerical experiments show that the results of the proposed method improve about 3%(88.8%→91.5%) performance using the proposed method rather than the existed methods.

키워드

참고문헌

D. Yi, S. Ji & S. Bu. (2019). An Enhanced Optimization Scheme Based on Gradient Descent Methods for Machine Learning. symmetry, 11 (7), 942-959. https://doi.org/10.3390/sym11070942
H. Zulkifli. (2018). Understanding Learning Rates and How It Improves Performance in Deep Learning. [Online] https://towardsdatascience.com/understanding-learning-rates-and-how-it-improves-performance-in-deep-learning-d0d4059c1c10?gi=e082fbb7c7a9.
S. Lau. (2017). Learning Rate Schedules and Adaptive Learning Rate Methods for Deep Learning. Towards Data Science. [Online] https://towardsdatascience.com/learning-rate-schedules-and-adaptive-learning-rate-methods-for-deep-learning-2c8f433990d1.
G. Aurelien. (2017). Gradient Descent . Hands-On Machine Learning with Scikit-Learn and TensorFlow. O'Reilly. pp. 113-124. ISBN 978-1-4919-6229-9.
J. Duchi, E. Hazan & Y. Singer. (2011). Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res, 12 , 2121-2159.
Y. LeCun, L. Bottou, Y. Bengio & P. Haffner. (1998). Gradient-based learning applied to document recognition. Proc. IEEE, 86 , 2278-2324. https://doi.org/10.1109/5.726791
R. Pascanu & Y. Bengio. (2013). Revisiting natural gradient for deep networks. arXiv:1301.3584.
J. Sohl-Dickstein, B. Poole & S. Ganguli. (2014). Fast large-scale optimization by unifying stochastic gradient and quasi-newton methods. In Proceedings of the 31st International Conference on Machine Learning. (pp. 604-612). Beijing, China.
P. Baldi & K. Hornik. (1989). Neural networks and principal component analysis: Learning from examples without local minima. Neural Networks, 2(1), 53-58. https://doi.org/10.1016/0893-6080(89)90014-2
M. Zinkevich. (2003). Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the Twentieth International Conference on Machine Learning. (pp. 928-936). Washington, DC, USA.
C. T. Kelley. (1995). Iterative methods for linear and nonlinear equations(Volume 16). In Frontiers in Applied Mathematics; SIAM: Philadelphia, PA, USA.
I. Sutskever, J. Martens, G. Dahl & G. E. Hinton. (2013). On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on Machine Learning. (pp. 1139-1147). Atlanta, GA, USA.
M. D. Zeiler. (2012). Adadelta: An adaptive learning rate method. arXiv:1212.5701.
D. P. Kingma & J. L. Ba. (2015). Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference for Learning Representations. (pp. 7-9). San Diego, CA, USA.
M. J. Kochenderfer & T. A. Wheeler. (2019). Algorithms for Optimization. London : The MIT Press Cambridge.

디지털융복합연구 (Journal of Digital Convergence)

학습률(Step-Size)변화에 따른 디지털 신호의 기계학습 방법 개선

Improvement of existing machine learning methods of digital signal by changing the step-size

초록

키워드

참고문헌

자세히 찾기