Title/Summary/Keyword: AdaGrad

Performance Evaluation of Machine Learning Optimizers (기계학습 옵티마이저 성능 평가)

  • Joo, Gihun; Park, Chihyun; Im, Hyeonseung
    • Journal of IKEEE, v.24 no.3, pp.766-776, 2020
  • Recently, as interest in machine learning (ML) has grown and research using ML has become more active, finding an optimal hyperparameter combination for various ML models has become increasingly important. In this paper, we focus on one such hyperparameter, the optimizer, and measure and compare the performance of the major optimizers on various datasets. Specifically, we compare nine optimizers, from the most basic SGD to Momentum, NAG, AdaGrad, RMSProp, AdaDelta, Adam, AdaMax, and Nadam, on the MNIST, CIFAR-10, IRIS, TITANIC, and Boston Housing Price datasets. Experimental results show that with Adam or Nadam, the loss of the various ML models decreased most rapidly and their F1 scores also improved. Meanwhile, AdaMax was markedly unstable during training, and AdaDelta converged more slowly and performed worse than the other optimizers.
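
The comparison the abstract describes can be reproduced in outline. Below is a minimal PyTorch sketch (not the authors' code; the model size, learning rates, and the random stand-in batch are assumptions) that trains the same nine optimizers from identical initial weights and reports the final loss of each:

```python
import torch
import torch.nn as nn
from torch.optim import SGD, Adagrad, RMSprop, Adadelta, Adam, Adamax, NAdam

def make_model():
    # Small fully connected classifier, sized for 28x28 MNIST-style inputs.
    return nn.Sequential(nn.Flatten(), nn.Linear(784, 128),
                         nn.ReLU(), nn.Linear(128, 10))

# The nine optimizers compared in the paper (learning rates are assumed defaults).
optimizers = {
    "SGD":      lambda p: SGD(p, lr=0.01),
    "Momentum": lambda p: SGD(p, lr=0.01, momentum=0.9),
    "NAG":      lambda p: SGD(p, lr=0.01, momentum=0.9, nesterov=True),
    "AdaGrad":  lambda p: Adagrad(p, lr=0.01),
    "RMSProp":  lambda p: RMSprop(p, lr=0.001),
    "AdaDelta": lambda p: Adadelta(p),
    "Adam":     lambda p: Adam(p, lr=0.001),
    "AdaMax":   lambda p: Adamax(p, lr=0.002),
    "Nadam":    lambda p: NAdam(p, lr=0.002),
}

loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,))  # stand-in batch

for name, make_opt in optimizers.items():
    torch.manual_seed(0)                # identical initial weights per optimizer
    model = make_model()
    opt = make_opt(model.parameters())
    for _ in range(100):                # short training run per optimizer
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    print(f"{name:>8}: final loss {loss.item():.4f}")
```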

Optimal Algorithm and Number of Neurons in Deep Learning (딥러닝 학습에서 최적의 알고리즘과 뉴론수 탐색)

  • Jang, Ha-Young; You, Eun-Kyung; Kim, Hyeock-Jin
    • Journal of Digital Convergence, v.20 no.4, pp.389-396, 2022
  • Deep learning is based on the perceptron and is currently used in various fields such as image recognition, voice recognition, object detection, and drug development. Accordingly, a variety of learning algorithms have been proposed, and the number of neurons constituting a neural network varies greatly among researchers. This study analyzed the learning characteristics of the currently used SGD, Momentum, AdaGrad, RMSProp, and Adam methods as a function of the number of neurons. To this end, a neural network was constructed with one input layer, three hidden layers, and one output layer. ReLU was used as the activation function, cross-entropy error (CEE) as the loss function, and MNIST as the experimental dataset. The results indicate that 100-300 neurons per layer, the Adam algorithm, and 200 training iterations are the most efficient for deep learning training. This study provides guidance on algorithm selection and a reference value for the number of neurons when new training data are given in the future.
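
A minimal sketch of the configuration the abstract describes (three hidden layers, ReLU, cross-entropy, Adam, 200 iterations). The 256-neuron width is an assumed value from the reported 100-300 range, and the random batch stands in for MNIST; this is not the authors' code:

```python
import torch
import torch.nn as nn

n_neurons = 256  # assumed width, chosen from the 100-300 range reported as most efficient

model = nn.Sequential(                    # input layer -> 3 hidden layers -> output
    nn.Linear(784, n_neurons), nn.ReLU(),
    nn.Linear(n_neurons, n_neurons), nn.ReLU(),
    nn.Linear(n_neurons, n_neurons), nn.ReLU(),
    nn.Linear(n_neurons, 10),             # 10 MNIST classes; softmax is folded
)                                         # into CrossEntropyLoss below

loss_fn = nn.CrossEntropyLoss()           # cross-entropy error (CEE)
opt = torch.optim.Adam(model.parameters())

x, y = torch.randn(128, 784), torch.randint(0, 10, (128,))  # stand-in for MNIST
for step in range(200):                   # 200 iterations, as the paper concludes
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
```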

Simulating the performance of the reinforced concrete beam using artificial intelligence

  • Yong Cao; Ruizhe Qiu; Wei Qi
    • Advances in Concrete Construction, v.15 no.4, pp.269-286, 2023
  • In the present study, we aim to use the numerically computed frequencies of a functionally graded beam under thermal and dynamic loadings to train and test an artificial neural network. A shear-deformable functionally graded beam structure is considered for obtaining the natural frequency under different boundary conditions and material grading indices. Both analytical and numerical solutions, based on Navier's approach and the differential quadrature method, are presented to capture the effects of different parameters on the natural frequency of the structure. The numerical results are then used to train an artificial neural network (ANN) with the AdaGrad optimization algorithm. Finally, the ANN results are compared with those of the other solution procedures, and a comprehensive parametric study examines the effects of geometrical, material, and boundary conditions on the free oscillation frequency of the functionally graded beam structure.
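
As a rough illustration of the surrogate-modeling step only, the sketch below trains a small network with AdaGrad to map beam parameters to a natural frequency. The input features, network size, and synthetic data are assumptions, not the paper's setup:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(3, 32), nn.ReLU(),   # assumed inputs: grading index, boundary-condition
    nn.Linear(32, 32), nn.ReLU(),  # code, slenderness ratio
    nn.Linear(32, 1),              # output: predicted natural frequency
)
opt = torch.optim.Adagrad(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

# Synthetic stand-in for the numerical (Navier/DQM) frequency results
# that the paper uses as training data.
X = torch.rand(500, 3)
y = torch.rand(500, 1)

for epoch in range(1000):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
```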

Helmet and Mask Classification for Personnel Safety Using a Deep Learning (딥러닝 기반 직원 안전용 헬멧과 마스크 분류)

  • Shokhrukh, Bibalaev; Kim, Kang-Chul
    • The Journal of the Korea Institute of Electronic Communication Sciences, v.17 no.3, pp.473-482, 2022
  • Wearing a mask is necessary to limit the risk of infection in the era of COVID-19, and wearing a helmet is essential for the safety of personnel who work in dangerous environments such as construction sites. This paper proposes an effective deep learning model, HelmetMask-Net, to classify both helmets and masks. HelmetMask-Net is a CNN consisting of data preprocessing, convolutional layers, max pooling layers, and fully connected layers, and it distinguishes four output classes: Helmet, Mask, Helmet & Mask, and no Helmet & no Mask. The final configuration of two convolutional layers with the AdaGrad optimizer was chosen through simulations over different optimizers and hyperparameter settings, evaluated by accuracy. Simulation results show 99% accuracy and the best performance compared with other models. The results of this paper can enhance the safety of personnel in the COVID-19 era.
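
A plausible reconstruction of the described architecture (two convolutional layers, max pooling, fully connected layers, four classes, AdaGrad). Kernel sizes, channel counts, and the 64x64 input resolution are assumptions, not details from the paper:

```python
import torch
import torch.nn as nn

class HelmetMaskNet(nn.Module):
    def __init__(self, num_classes=4):   # Helmet / Mask / both / neither
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),              # 64x64 -> 32x32 (assumed input size)
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),              # 32x32 -> 16x16
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = HelmetMaskNet()
opt = torch.optim.Adagrad(model.parameters(), lr=0.01)  # the optimizer the paper selected
print(model(torch.randn(1, 3, 64, 64)).shape)           # torch.Size([1, 4])
```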

Comparison of Gradient Descent for Deep Learning (딥러닝을 위한 경사하강법 비교)

  • Kang, Min-Jae
    • Journal of the Korea Academia-Industrial cooperation Society, v.21 no.2, pp.189-194, 2020
  • This paper analyzes gradient descent, the method most widely used for training neural networks. Learning means updating parameters so that the loss function, which quantifies the difference between actual and predicted values, reaches its minimum. Gradient descent uses the slope of the loss function to update parameters so as to minimize error, and it underlies the libraries that provide today's best deep learning algorithms. However, these libraries expose the algorithms as black boxes, making it difficult to identify the advantages and disadvantages of the various gradient descent methods. This paper analyzes the characteristics of four methods in current use: stochastic gradient descent, the momentum method, AdaGrad, and AdaDelta. The experiments use the Modified National Institute of Standards and Technology (MNIST) dataset, which is widely used to verify neural networks. The network has two hidden layers: the first with 500 neurons and the second with 300. The output layer uses the softmax activation function, and the rectified linear unit (ReLU) is used for the input and hidden layers. The loss function is cross-entropy error.
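
For reference, the four update rules this paper analyzes can be written out explicitly rather than left behind a library's black box. This is a minimal NumPy sketch of the standard formulations of each method:

```python
import numpy as np

def sgd(w, g, lr=0.01):
    # Vanilla (stochastic) gradient descent: step against the gradient.
    return w - lr * g

def momentum(w, g, v, lr=0.01, beta=0.9):
    # Velocity accumulates past gradients, smoothing the trajectory.
    v = beta * v - lr * g
    return w + v, v

def adagrad(w, g, h, lr=0.01, eps=1e-8):
    # Per-parameter sum of squared gradients shrinks the effective
    # step size for frequently updated parameters.
    h = h + g * g
    return w - lr * g / (np.sqrt(h) + eps), h

def adadelta(w, g, Eg2, Edx2, rho=0.95, eps=1e-6):
    # Running averages of g^2 and of past updates replace the global
    # learning rate entirely.
    Eg2 = rho * Eg2 + (1 - rho) * g * g
    dx = -np.sqrt(Edx2 + eps) / np.sqrt(Eg2 + eps) * g
    Edx2 = rho * Edx2 + (1 - rho) * dx * dx
    return w + dx, Eg2, Edx2
```

The contrast between AdaGrad and AdaDelta is visible directly in the state each carries: AdaGrad's accumulator `h` only grows, so its steps monotonically shrink, while AdaDelta's decaying averages let the step size recover, which is consistent with the convergence differences the papers above report.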