• Title/Summary/Keyword: Optimizers

Pragmatic Assessment of Optimizers in Deep Learning

  • Ajeet K. Jain; PVRD Prasad Rao; K. Venkatesh Sharma
    • International Journal of Computer Science & Network Security / v.23 no.10 / pp.115-128 / 2023
  • Deep learning incorporates a wide range of optimization techniques, motivated by continual advances in pragmatic optimizing algorithms, and their usage plays a central role in machine learning. Recently, new variants of various optimizers have been put into practice, and their suitability and applicability have been reported across domains. This line of development extends from Stochastic Gradient Descent to convex, non-convex, and derivative-free approaches. Given this landscape of optimizers, choosing a best-fit or appropriate optimizer is an important consideration in deep learning, as these workhorse engines determine the final performance of the model. Moreover, a growing number of deep layers brings higher complexity in hyper-parameter tuning and thus a greater need to find a befitting optimizer. We empirically examine the most popular and widely used optimizers on various datasets and networks, including MNIST and GANs. The pragmatic comparison focuses on their similarities, differences, and suitability for a given application. Additionally, recent optimizer variants are highlighted along with their subtleties. The article emphasizes their critical role and pinpoints practical considerations for choosing among them.
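As a concrete illustration of this kind of empirical comparison, here is a minimal sketch assuming TensorFlow/Keras and a small network on MNIST; the layer sizes, epoch count, and optimizer list are illustrative assumptions, not the paper's setup:

```python
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def make_model():
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

results = {}
for name in ["sgd", "rmsprop", "adam", "nadam"]:
    model = make_model()                      # identical network each time
    model.compile(optimizer=name,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=3, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    results[name] = acc                       # test accuracy per optimizer

print(results)
```

Swapping only the optimizer while holding the architecture fixed isolates the optimizer's effect, which mirrors the comparison methodology the abstract describes.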

Implementation and Analysis of Optimizers on Tuple Codes

  • 송진국
    • Journal of the Korea Institute of Information and Communication Engineering / v.3 no.4 / pp.723-736 / 1999
  • The code optimization phase in a compiler is very important because it reduces the running time and the storage size of the machine code. I developed flow analyzers and optimizers that operate on intermediate codes. The flow analyzers generate control-flow and data-flow information, and the optimizers transform the intermediate codes into improved codes using this information. This paper describes the development of the flow analyzers and optimizers. I also examined the execution performance, the cost, and the dependencies of each optimization.
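For orientation, here is a minimal sketch of one such data-flow-driven transformation, constant folding over a tuple-style intermediate code; the quadruple format (op, dest, src1, src2) is our assumption for illustration, not the paper's actual IR:

```python
def fold_constants(code):
    """One pass of constant folding driven by a simple data-flow fact:
    which temporaries are currently known to hold constants."""
    env = {}
    out = []
    for op, dest, a, b in code:
        a = env.get(a, a)                      # substitute known constants
        b = env.get(b, b)
        if op == "const":
            env[dest] = a
            out.append((op, dest, a, b))
        elif op == "add" and isinstance(a, int) and isinstance(b, int):
            env[dest] = a + b                  # fold: both operands constant
            out.append(("const", dest, a + b, None))
        else:
            env.pop(dest, None)                # dest now holds unknown value
            out.append((op, dest, a, b))
    return out

ir = [("const", "t1", 2, None),
      ("const", "t2", 3, None),
      ("add", "t3", "t1", "t2"),               # folded to const 5
      ("add", "t4", "t3", "x")]                # becomes add t4, 5, x
print(fold_constants(ir))
```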

Comparison of Different Deep Learning Optimizers for Modeling Photovoltaic Power

  • Poudel, Prasis; Bae, Sang Hyun; Jang, Bongseog
    • Journal of Integrative Natural Science / v.11 no.4 / pp.204-208 / 2018
  • This paper compares the performance of different optimizers for photovoltaic power modeling using deep artificial neural network techniques. Six deep learning optimizers are tested on Long Short-Term Memory (LSTM) networks: Adam, Stochastic Gradient Descent, Root Mean Square Propagation, Adaptive Gradient, and the variants Adamax and Nadam. To compare the optimization techniques, both highly and weakly fluctuating photovoltaic power outputs are examined; the power output is real data obtained from a site at Mokpo University. Using Keras in Python, we developed a prediction program for evaluating the optimizers. The prediction errors of each optimizer in both the high- and low-power cases show that Adam performs better than the other optimizers.
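A minimal sketch of the described setup, assuming Keras in Python with synthetic stand-in data; the window length, layer sizes, and epoch count are illustrative assumptions:

```python
import numpy as np
import tensorflow as tf

# Stand-in data: 500 windows of 24 time steps of PV power, next-step target.
X = np.random.rand(500, 24, 1).astype("float32")
y = np.random.rand(500).astype("float32")

errors = {}
for name in ["adam", "sgd", "rmsprop", "adagrad", "adamax", "nadam"]:
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(32, input_shape=(24, 1)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=name, loss="mse")
    model.fit(X, y, epochs=5, verbose=0)
    errors[name] = model.evaluate(X, y, verbose=0)   # MSE per optimizer

print(min(errors, key=errors.get))                   # lowest-error optimizer
```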

Development of new artificial neural network optimizer to improve water quality index prediction performance

  • Ryu, Yong Min; Kim, Young Nam; Lee, Dae Won; Lee, Eui Hoon
    • Journal of Korea Water Resources Association / v.57 no.2 / pp.73-85 / 2024
  • Predicting the water quality of rivers and reservoirs is necessary for the management of water resources. Artificial Neural Networks (ANNs) have been used in many studies to predict water quality with high accuracy. Previous studies have used Gradient Descent (GD)-based optimizers, the operators of an ANN that search its parameters. However, GD-based optimizers have the disadvantages of possible convergence to local optima and the absence of a structure for storing and comparing solutions. This study developed improved optimizers to overcome these disadvantages. The proposed optimizers combine adaptive moments (Adam) and Nesterov-accelerated adaptive moments (Nadam), which have low learning errors among GD-based optimizers, with Harmony Search (HS) or Novel Self-adaptive Harmony Search (NSHS). To evaluate the performance of Long Short-Term Memory (LSTM) networks using the improved optimizers, water quality data from the Dasan water quality monitoring station were used for training and prediction. Comparing the learning results, the Mean Squared Error (MSE) of the LSTM using Nadam combined with NSHS (NadamNSHS) was the lowest at 0.002921. In addition, the prediction rankings according to MSE and R2 for the four water quality indices were compared for each optimizer. Comparing the average ranking of each optimizer, the LSTM using NadamNSHS was confirmed to be the highest at 2.25.
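The hybrid idea, a gradient-based optimizer combined with a Harmony Search stage that stores and compares candidate solutions, can be sketched as follows; the parameter values, bounds, and toy loss are assumptions for illustration, not the NadamNSHS algorithm itself:

```python
import numpy as np

def harmony_search(loss, dim, iters=1000, hms=10, hmcr=0.9, par=0.3, bw=0.05):
    """Minimal Harmony Search: keeps a memory of solutions and improvises
    new ones, replacing the worst stored solution when improved."""
    rng = np.random.default_rng(0)
    memory = rng.uniform(-1.0, 1.0, (hms, dim))      # harmony memory
    scores = np.array([loss(h) for h in memory])
    for _ in range(iters):
        new = np.empty(dim)
        for j in range(dim):
            if rng.random() < hmcr:                  # memory consideration
                new[j] = memory[rng.integers(hms), j]
                if rng.random() < par:               # pitch adjustment
                    new[j] += bw * rng.uniform(-1.0, 1.0)
            else:                                    # random selection
                new[j] = rng.uniform(-1.0, 1.0)
        score = loss(new)
        worst = scores.argmax()
        if score < scores[worst]:                    # store if it improves
            memory[worst], scores[worst] = new, score
    return memory[scores.argmin()]

# e.g. refine a parameter vector left by a gradient-based (Nadam) run:
refined = harmony_search(lambda w: float(np.sum((w - 0.5) ** 2)), dim=5)
```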

Performance Evaluation of Machine Learning Optimizers

  • Joo, Gihun; Park, Chihyun; Im, Hyeonseung
    • Journal of IKEEE / v.24 no.3 / pp.766-776 / 2020
  • Recently, as interest in machine learning (ML) has increased and research using ML has become active, it has become more important to find an optimal hyperparameter combination for various ML models. In this paper, among the various hyperparameters, we focus on ML optimizers, and measure and compare the performance of the major optimizers on various datasets. In particular, we compare nine optimizers, from the most basic SGD to Momentum, NAG, AdaGrad, RMSProp, AdaDelta, Adam, AdaMax, and Nadam, using the MNIST, CIFAR-10, IRIS, TITANIC, and Boston Housing Price datasets. Experimental results showed that when Adam or Nadam was used, the loss of the various ML models decreased most rapidly and their F1 scores also increased. Meanwhile, AdaMax showed considerable instability during training, and AdaDelta showed slower convergence and lower performance than the other optimizers.
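A minimal sketch of such a convergence comparison across the nine optimizers, assuming TensorFlow/Keras on the IRIS dataset; the network size and epoch count are illustrative assumptions:

```python
import tensorflow as tf
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

optimizers = {
    "SGD": tf.keras.optimizers.SGD(),
    "Momentum": tf.keras.optimizers.SGD(momentum=0.9),
    "NAG": tf.keras.optimizers.SGD(momentum=0.9, nesterov=True),
    "AdaGrad": tf.keras.optimizers.Adagrad(),
    "RMSProp": tf.keras.optimizers.RMSprop(),
    "AdaDelta": tf.keras.optimizers.Adadelta(),
    "Adam": tf.keras.optimizers.Adam(),
    "AdaMax": tf.keras.optimizers.Adamax(),
    "Nadam": tf.keras.optimizers.Nadam(),
}

loss_curves = {}
for name, opt in optimizers.items():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
        tf.keras.layers.Dense(3, activation="softmax"),
    ])
    model.compile(optimizer=opt, loss="sparse_categorical_crossentropy")
    history = model.fit(X, y, epochs=50, verbose=0)
    loss_curves[name] = history.history["loss"]   # convergence behaviour

# Fewer epochs to reach a given loss indicates faster convergence.
```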

An Exact Logarithmic-Exponential Multiplier Penalty Function

  • Lian, Shu-jun
    • Journal of Applied Mathematics & Informatics / v.28 no.5_6 / pp.1477-1487 / 2010
  • In this paper, we give a solution approach based on a logarithmic-exponential multiplier penalty function for the constrained minimization problem. It is proved exact in the sense that the local optimizers of the nonlinear problem are precisely the local optimizers of the logarithmic-exponential multiplier penalty problem.
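For orientation, a logarithmic-exponential penalty of this flavor is often built from a smoothed max; the following is an illustrative form under our own assumptions, not necessarily the exact function analyzed in the paper:

```latex
% Problem: minimize f(x) subject to g_i(x) <= 0, i = 1, ..., m.
% Illustrative log-exponential multiplier penalty (our assumption):
F_\rho(x,\lambda) \;=\; f(x) \;+\; \frac{1}{\rho}\sum_{i=1}^{m}
  \lambda_i \,\ln\!\bigl(1 + e^{\rho\, g_i(x)}\bigr),
\qquad \lambda_i \ge 0,\ \rho > 0.
% As \rho grows, (1/\rho) ln(1 + e^{\rho g_i}) approaches max(0, g_i(x)),
% so minimizers of F_\rho track the constrained local optimizers.
```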

Supervised learning-based DDoS attacks detection: Tuning hyperparameters

  • Kim, Meejoung
    • ETRI Journal / v.41 no.5 / pp.560-573 / 2019
  • Two supervised learning algorithms, a basic neural network and a long short-term memory recurrent neural network, are applied to traffic including DDoS attacks. The joint effects of preprocessing methods and machine learning hyperparameters on performance are investigated. Values representing attack characteristics are extracted from the datasets and preprocessed by two methods. Binary classification and two optimizers are used. Some hyperparameters are obtained by exhaustive search for fast and accurate detection, while others are fixed as constants to account for performance and data characteristics. An experiment is performed via TensorFlow on three traffic datasets. Three scenarios are considered to investigate the effects of learning earlier traffic on sequential traffic analysis, the effects of learning one dataset when applying the model to another dataset, and whether the algorithms can be used on recent attack traffic. Experimental results show that the chosen preprocessing methods, neural network architectures, hyperparameters, and optimizers are appropriate for DDoS attack detection. The results provide a criterion for the detection accuracy of attacks.
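A hedged sketch of an exhaustive search over a few hyperparameters, assuming TensorFlow/Keras with synthetic stand-in traffic features; the grids, window shape, and epoch count are our assumptions, not the paper's values:

```python
import itertools
import numpy as np
import tensorflow as tf

# Stand-in for preprocessed traffic windows: 1000 samples, 10 time steps,
# 4 extracted attack-characteristic features; labels are attack / normal.
X = np.random.rand(1000, 10, 4).astype("float32")
y = np.random.randint(0, 2, 1000)

best = None
for units, lr, opt_name in itertools.product([16, 32], [1e-3, 1e-2],
                                              ["adam", "sgd"]):
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(units, input_shape=(10, 4)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    opt = (tf.keras.optimizers.Adam(learning_rate=lr) if opt_name == "adam"
           else tf.keras.optimizers.SGD(learning_rate=lr))
    model.compile(optimizer=opt, loss="binary_crossentropy",
                  metrics=["accuracy"])
    hist = model.fit(X, y, epochs=3, validation_split=0.2, verbose=0)
    val_acc = hist.history["val_accuracy"][-1]
    if best is None or val_acc > best[0]:
        best = (val_acc, units, lr, opt_name)

print(best)   # best validation accuracy and its hyperparameter combination
```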

Performance Comparison of the Optimizers in a Faster R-CNN Model for Object Detection of Metaphase Chromosomes

  • Jung, Wonseok; Lee, Byeong-Soo; Seo, Jeongwook
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.11 / pp.1357-1363 / 2019
  • In this paper, we compare the performance of gradient descent optimizers in a Faster Region-based Convolutional Neural Network (R-CNN) model for chromosome object detection in digital images composed of human metaphase chromosomes. In Faster R-CNN, a gradient descent optimizer is used to minimize the objective functions of the region proposal network (RPN) module and of the classification-score and bounding-box regression blocks. Through performance comparisons among four gradient descent optimizers in our experiments, we found that the Adamax optimizer achieved a mean average precision (mAP) of about 52% for Faster R-CNN with a VGG16 base network. For Faster R-CNN with a ResNet50 base network, the Adadelta optimizer achieved a mAP of about 58%.
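A minimal sketch of swapping the gradient descent optimizer under a Faster R-CNN model, assuming a recent torchvision; the learning rate, toy image, and single training step are illustrative assumptions:

```python
import torch
import torchvision

# Build Faster R-CNN with untrained weights, purely for illustration.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None)
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adamax(params, lr=1e-3)  # or torch.optim.Adadelta(params)

model.train()
images = [torch.rand(3, 256, 256)]               # one toy image
targets = [{"boxes": torch.tensor([[10.0, 10.0, 50.0, 60.0]]),
            "labels": torch.tensor([1])}]

# One step: minimize the summed RPN and detection-head losses.
loss_dict = model(images, targets)
loss = sum(loss_dict.values())
optimizer.zero_grad()
loss.backward()
optimizer.step()
```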

A Unicode based Deep Handwritten Character Recognition model for Telugu to English Language Translation

  • BV Subba Rao; J. Nageswara Rao; Bandi Vamsi; Venkata Nagaraju Thatha; Katta Subba Rao
    • International Journal of Computer Science & Network Security / v.24 no.2 / pp.101-112 / 2024
  • Telugu is considered the fourth most used language in India, especially in the regions of Andhra Pradesh, Telangana, and Karnataka, and its speaker base is also growing internationally. The language comprises various dependent and independent vowels, consonants, and digits. Despite this, Telugu Handwritten Character Recognition (HCR) has seen little advancement. HCR is a neural network technique for converting a document image into editable text, which can then be used in many other applications; it saves time and effort by removing the need to start over from scratch each time. In this work, a Unicode-based Handwritten Character Recognition (U-HCR) model is developed for translating handwritten Telugu characters into English. Using the Centre of Gravity (CG) in our model, a compound character can easily be divided into individual characters with the help of Unicode values. Both online and offline Telugu character datasets were used to train the model. To extract features from the scanned image, we used a convolutional neural network along with machine learning classifiers such as Random Forest and Support Vector Machine. Stochastic Gradient Descent (SGD), Root Mean Square Propagation (RMS-P), and Adaptive Moment Estimation (ADAM) optimizers are used in this work to enhance the performance of U-HCR and to reduce the loss value. On both the online and offline datasets, the proposed model showed promising results, with accuracies of 90.28% for SGD, 96.97% for RMS-P, and 93.57% for ADAM.
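The feature-extraction pipeline the abstract describes, a CNN producing features that feed classical classifiers, can be sketched as follows; the image size, network depth, and random stand-in data are our assumptions:

```python
import numpy as np
import tensorflow as tf
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Stand-in data: 200 grayscale character images with 10 class labels.
X = np.random.rand(200, 32, 32, 1).astype("float32")
y = np.random.randint(0, 10, 200)

# Small CNN used purely as a feature extractor.
cnn = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
])
features = cnn.predict(X, verbose=0)

# Classical classifiers trained on the CNN features.
rf = RandomForestClassifier().fit(features, y)
svm = SVC().fit(features, y)
print(rf.score(features, y), svm.score(features, y))   # training accuracy
```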

Employing TLBO and SCE for optimal prediction of the compressive strength of concrete

  • Zhao, Yinghao; Moayedi, Hossein; Bahiraei, Mehdi; Foong, Loke Kok
    • Smart Structures and Systems / v.26 no.6 / pp.753-763 / 2020
  • Early prediction of the Compressive Strength of Concrete (CSC) is a significant task in civil engineering construction projects. This study is therefore dedicated to introducing two novel hybrids of neural computing, namely Shuffled Complex Evolution (SCE) and Teaching-Learning-Based Optimization (TLBO), for predicting the CSC. The algorithms are applied to a Multi-Layer Perceptron (MLP) network to create the SCE-MLP and TLBO-MLP ensembles. The results revealed, first, that intelligent models can properly analyze and generalize the non-linear relationship between the CSC and its influential parameters. For example, the smallest and largest values of the CSC were 17.19 and 58.53 MPa, while the outputs of the MLP, SCE-MLP, and TLBO-MLP ranged over [17.61, 54.36], [17.69, 55.55], and [18.07, 53.83], respectively. Second, applying the SCE and TLBO optimizers increased the correlation of the MLP predictions from 93.58% to 97.32% and 97.22%, respectively. The prediction error was also reduced by around 34% and 31%, which indicates the high efficiency of these algorithms. Moreover, regarding the computation time needed to implement the SCE-MLP and TLBO-MLP models, SCE is a considerably more time-efficient optimizer. Nevertheless, both suggested models can be promising substitutes for laboratory and destructive CSC evaluation models.
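For orientation, a minimal TLBO loop of the kind used to tune MLP parameters is sketched below; the population size, iteration count, bounds, and toy quadratic loss are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def tlbo(loss, dim, pop=20, iters=100):
    """Minimal Teaching-Learning-Based Optimization loop."""
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, (pop, dim))
    F = np.array([loss(x) for x in X])
    for _ in range(iters):
        teacher, mean = X[F.argmin()], X.mean(axis=0)
        for i in range(pop):
            # Teacher phase: pull the learner toward the teacher.
            TF = rng.integers(1, 3)              # teaching factor in {1, 2}
            cand = X[i] + rng.random(dim) * (teacher - TF * mean)
            fc = loss(cand)
            if fc < F[i]:
                X[i], F[i] = cand, fc
            # Learner phase: learn from a randomly chosen classmate.
            j = int(rng.integers(pop))
            if j != i:
                step = X[i] - X[j] if F[i] < F[j] else X[j] - X[i]
                cand = X[i] + rng.random(dim) * step
                fc = loss(cand)
                if fc < F[i]:
                    X[i], F[i] = cand, fc
    return X[F.argmin()]

best_w = tlbo(lambda w: float(np.sum(w ** 2)), dim=10)   # toy quadratic loss
```

The same loop could, in principle, wrap an MLP's flattened weight vector as the search variable, which is the hybridization the abstract describes.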