Optimization of Model based on ReLU Activation Function in MLP Neural Network Model

  • Ye Rim Youn (Division of Computer Engineering, Baekseok University) ;
  • Jinkeun Hong (Division of Advanced IT, Baekseok University)
  • Received : 2024.04.17
  • Accepted : 2024.04.30
  • Published : 2024.06.30

Abstract

This paper focuses on improving accuracy in constrained computing settings by employing the ReLU (Rectified Linear Unit) activation function. The research involves modifying the parameters associated with the ReLU function and comparing performance in terms of accuracy and computation time. Specifically, the paper optimizes ReLU in the context of a Multilayer Perceptron (MLP) by determining the ideal values for the dimensions of the linear layers and the learning rate (lr). The experiments adjust the linear layer dimensions and lr values to find the configuration that yields the best performance. The results show that using ReLU alone yielded the highest accuracy of 96.7% when the dimension sizes were 30 - 10 and the lr value was 1. When combining ReLU with the Adam optimizer, the optimal model configuration had dimension sizes of 60 - 40 - 10 and an lr value of 0.001, which resulted in the highest accuracy of 97.07%.
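
The abstract does not include implementation details, so the following is a minimal sketch of the ReLU + Adam configuration it describes, assuming a PyTorch MLP on flattened 28x28 Fashion-MNIST images (784 inputs, 10 classes) and reading the dimension sizes 60 - 40 - 10 as the output widths of successive linear layers; the class name, training step, and data handling are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (assumptions: PyTorch, 784-dim flattened Fashion-MNIST inputs, 10 classes).
# Layer widths 60 - 40 - 10 and Adam with lr = 0.001 follow the best configuration
# reported in the abstract; all other details are illustrative.
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, in_dim=784, dims=(60, 40, 10)):
        super().__init__()
        layers, prev = [], in_dim
        for d in dims[:-1]:
            layers += [nn.Linear(prev, d), nn.ReLU()]   # ReLU after each hidden linear layer
            prev = d
        layers.append(nn.Linear(prev, dims[-1]))        # final linear layer outputs class logits
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x.flatten(1))                   # flatten 28x28 images to 784 features

model = MLP(dims=(60, 40, 10))                          # ReLU + Adam configuration from the abstract
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

def train_step(x, y):
    """One optimization step on a batch of images x and integer labels y."""
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The ReLU-only result (dimensions 30 - 10, lr = 1) would correspond to a shallower model with a different optimizer setting; since the abstract does not name that optimizer, the sketch above shows only the ReLU + Adam case.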

Keywords
