• Title/Summary/Keyword: Optimizers

Pragmatic Assessment of Optimizers in Deep Learning

  • Ajeet K. Jain; PVRD Prasad Rao; K. Venkatesh Sharma
    • International Journal of Computer Science & Network Security / v.23 no.10 / pp.115-128 / 2023
  • Deep learning incorporates a wide range of optimization techniques, motivated by continual advances in pragmatic optimizing algorithms, and their usage plays a central role in machine learning. Recently, new variants of various optimizers have been put into practice, and their suitability and applicability have been reported across domains. This line of development extends from Stochastic Gradient Descent to convex, non-convex, and derivative-free approaches. Given this landscape of optimizers, choosing a best-fit or appropriate optimizer is an important consideration in deep learning, as these workhorse engines determine the final performance of the model. Moreover, a growing number of deep layers brings higher complexity in hyper-parameter tuning and thus a greater need to find a befitting optimizer. We empirically examine the most popular and widely used optimizers on various datasets and networks, including MNIST and GANs. The pragmatic comparison focuses on their similarities, differences, and suitability for a given application. Additionally, recent optimizer variants are highlighted along with their subtleties. The article emphasizes their critical role and pinpoints practical considerations for choosing among them.
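As a concrete illustration of this kind of empirical comparison, here is a minimal sketch assuming TensorFlow/Keras and a small network on MNIST; the layer sizes, epoch count, and optimizer list are illustrative assumptions, not the paper's setup:

```python
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def make_model():
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

results = {}
for name in ["sgd", "rmsprop", "adam", "nadam"]:
    model = make_model()                      # identical network each time
    model.compile(optimizer=name,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=3, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    results[name] = acc                       # test accuracy per optimizer

print(results)
```

Swapping only the optimizer while holding the architecture fixed isolates the optimizer's effect, which mirrors the comparison methodology the abstract describes.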

Implementation and Analysis of Optimizers on Tuple Codes

  • 송진국
    • Journal of the Korea Institute of Information and Communication Engineering / v.3 no.4 / pp.723-736 / 1999
  • The code optimization phase in a compiler is very important because it reduces the running time and the storage size of the machine code. I developed flow analyzers and optimizers that operate on intermediate codes. The flow analyzers generate control-flow and data-flow information, and the optimizers transform the intermediate codes into improved codes using this information. This paper describes the development of the flow analyzers and optimizers. I also examined the execution performance, the cost, and the dependencies of each optimization.
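For orientation, here is a minimal sketch of one such data-flow-driven transformation, constant folding over a tuple-style intermediate code; the quadruple format (op, dest, src1, src2) is our assumption for illustration, not the paper's actual IR:

```python
def fold_constants(code):
    """One pass of constant folding driven by a simple data-flow fact:
    which temporaries are currently known to hold constants."""
    env = {}
    out = []
    for op, dest, a, b in code:
        a = env.get(a, a)                      # substitute known constants
        b = env.get(b, b)
        if op == "const":
            env[dest] = a
            out.append((op, dest, a, b))
        elif op == "add" and isinstance(a, int) and isinstance(b, int):
            env[dest] = a + b                  # fold: both operands constant
            out.append(("const", dest, a + b, None))
        else:
            env.pop(dest, None)                # dest now holds unknown value
            out.append((op, dest, a, b))
    return out

ir = [("const", "t1", 2, None),
      ("const", "t2", 3, None),
      ("add", "t3", "t1", "t2"),               # folded to const 5
      ("add", "t4", "t3", "x")]                # becomes add t4, 5, x
print(fold_constants(ir))
```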

Comparison of Different Deep Learning Optimizers for Modeling Photovoltaic Power

  • Poudel, Prasis; Bae, Sang Hyun; Jang, Bongseog
    • Journal of Integrative Natural Science / v.11 no.4 / pp.204-208 / 2018
  • This paper compares the performance of different optimizers for photovoltaic power modeling using deep artificial neural network techniques. Six deep learning optimizers are tested on Long Short-Term Memory (LSTM) networks: Adam, Stochastic Gradient Descent, Root Mean Square Propagation, Adaptive Gradient, and the variants Adamax and Nadam. To compare the optimization techniques, both highly and weakly fluctuating photovoltaic power outputs are examined; the power output is real data obtained from a site at Mokpo University. Using Keras in Python, we developed a prediction program for evaluating the optimizers. The prediction errors of each optimizer in both the high- and low-power cases show that Adam performs better than the other optimizers.
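A minimal sketch of the described setup, assuming Keras in Python with synthetic stand-in data; the window length, layer sizes, and epoch count are illustrative assumptions:

```python
import numpy as np
import tensorflow as tf

# Stand-in data: 500 windows of 24 time steps of PV power, next-step target.
X = np.random.rand(500, 24, 1).astype("float32")
y = np.random.rand(500).astype("float32")

errors = {}
for name in ["adam", "sgd", "rmsprop", "adagrad", "adamax", "nadam"]:
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(32, input_shape=(24, 1)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=name, loss="mse")
    model.fit(X, y, epochs=5, verbose=0)
    errors[name] = model.evaluate(X, y, verbose=0)   # MSE per optimizer

print(min(errors, key=errors.get))                   # lowest-error optimizer
```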

Development of new artificial neural network optimizer to improve water quality index prediction performance

  • Ryu, Yong Min; Kim, Young Nam; Lee, Dae Won; Lee, Eui Hoon
    • Journal of Korea Water Resources Association / v.57 no.2 / pp.73-85 / 2024
  • Predicting the water quality of rivers and reservoirs is necessary for the management of water resources. Artificial Neural Networks (ANNs) have been used in many studies to predict water quality with high accuracy. Previous studies have used Gradient Descent (GD)-based optimizers, the operators of an ANN that search its parameters. However, GD-based optimizers have the disadvantages of possible convergence to local optima and the absence of a structure for storing and comparing solutions. This study developed improved optimizers to overcome these disadvantages. The proposed optimizers combine adaptive moments (Adam) and Nesterov-accelerated adaptive moments (Nadam), which have low learning errors among GD-based optimizers, with Harmony Search (HS) or Novel Self-adaptive Harmony Search (NSHS). To evaluate the performance of Long Short-Term Memory (LSTM) networks using the improved optimizers, water quality data from the Dasan water quality monitoring station were used for training and prediction. Comparing the learning results, the Mean Squared Error (MSE) of the LSTM using Nadam combined with NSHS (NadamNSHS) was the lowest at 0.002921. In addition, the prediction rankings according to MSE and R2 for the four water quality indices were compared for each optimizer. Comparing the average ranking of each optimizer, the LSTM using NadamNSHS was confirmed to be the highest at 2.25.
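The hybrid idea, a gradient-based optimizer combined with a Harmony Search stage that stores and compares candidate solutions, can be sketched as follows; the parameter values, bounds, and toy loss are assumptions for illustration, not the NadamNSHS algorithm itself:

```python
import numpy as np

def harmony_search(loss, dim, iters=1000, hms=10, hmcr=0.9, par=0.3, bw=0.05):
    """Minimal Harmony Search: keeps a memory of solutions and improvises
    new ones, replacing the worst stored solution when improved."""
    rng = np.random.default_rng(0)
    memory = rng.uniform(-1.0, 1.0, (hms, dim))      # harmony memory
    scores = np.array([loss(h) for h in memory])
    for _ in range(iters):
        new = np.empty(dim)
        for j in range(dim):
            if rng.random() < hmcr:                  # memory consideration
                new[j] = memory[rng.integers(hms), j]
                if rng.random() < par:               # pitch adjustment
                    new[j] += bw * rng.uniform(-1.0, 1.0)
            else:                                    # random selection
                new[j] = rng.uniform(-1.0, 1.0)
        score = loss(new)
        worst = scores.argmax()
        if score < scores[worst]:                    # store if it improves
            memory[worst], scores[worst] = new, score
    return memory[scores.argmin()]

# e.g. refine a parameter vector left by a gradient-based (Nadam) run:
refined = harmony_search(lambda w: float(np.sum((w - 0.5) ** 2)), dim=5)
```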

Performance Evaluation of Machine Learning Optimizers

  • Joo, Gihun; Park, Chihyun; Im, Hyeonseung
    • Journal of IKEEE / v.24 no.3 / pp.766-776 / 2020
  • Recently, as interest in machine learning (ML) has increased and research using ML has become active, it has become more important to find an optimal hyperparameter combination for various ML models. In this paper, among the various hyperparameters, we focus on ML optimizers, and measure and compare the performance of the major optimizers on various datasets. In particular, we compare nine optimizers, from the most basic SGD to Momentum, NAG, AdaGrad, RMSProp, AdaDelta, Adam, AdaMax, and Nadam, using the MNIST, CIFAR-10, IRIS, TITANIC, and Boston Housing Price datasets. Experimental results showed that when Adam or Nadam was used, the loss of the various ML models decreased most rapidly and their F1 scores also increased. Meanwhile, AdaMax showed considerable instability during training, and AdaDelta showed slower convergence and lower performance than the other optimizers.
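A minimal sketch of such a convergence comparison across the nine optimizers, assuming TensorFlow/Keras on the IRIS dataset; the network size and epoch count are illustrative assumptions:

```python
import tensorflow as tf
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

optimizers = {
    "SGD": tf.keras.optimizers.SGD(),
    "Momentum": tf.keras.optimizers.SGD(momentum=0.9),
    "NAG": tf.keras.optimizers.SGD(momentum=0.9, nesterov=True),
    "AdaGrad": tf.keras.optimizers.Adagrad(),
    "RMSProp": tf.keras.optimizers.RMSprop(),
    "AdaDelta": tf.keras.optimizers.Adadelta(),
    "Adam": tf.keras.optimizers.Adam(),
    "AdaMax": tf.keras.optimizers.Adamax(),
    "Nadam": tf.keras.optimizers.Nadam(),
}

loss_curves = {}
for name, opt in optimizers.items():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
        tf.keras.layers.Dense(3, activation="softmax"),
    ])
    model.compile(optimizer=opt, loss="sparse_categorical_crossentropy")
    history = model.fit(X, y, epochs=50, verbose=0)
    loss_curves[name] = history.history["loss"]   # convergence behaviour

# Fewer epochs to reach a given loss indicates faster convergence.
```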

An Exact Logarithmic-Exponential Multiplier Penalty Function

  • Lian, Shu-jun
    • Journal of Applied Mathematics & Informatics / v.28 no.5_6 / pp.1477-1487 / 2010
  • In this paper, we give a solution approach based on a logarithmic-exponential multiplier penalty function for the constrained minimization problem. It is proved exact in the sense that the local optimizers of the nonlinear problem are precisely the local optimizers of the logarithmic-exponential multiplier penalty problem.
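For orientation, a logarithmic-exponential penalty of this flavor is often built from a smoothed max; the following is an illustrative form under our own assumptions, not necessarily the exact function analyzed in the paper:

```latex
% Problem: minimize f(x) subject to g_i(x) <= 0, i = 1, ..., m.
% Illustrative log-exponential multiplier penalty (our assumption):
F_\rho(x,\lambda) \;=\; f(x) \;+\; \frac{1}{\rho}\sum_{i=1}^{m}
  \lambda_i \,\ln\!\bigl(1 + e^{\rho\, g_i(x)}\bigr),
\qquad \lambda_i \ge 0,\ \rho > 0.
% As \rho grows, (1/\rho) ln(1 + e^{\rho g_i}) approaches max(0, g_i(x)),
% so minimizers of F_\rho track the constrained local optimizers.
```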

Supervised learning-based DDoS attacks detection: Tuning hyperparameters

  • Kim, Meejoung
    • ETRI Journal / v.41 no.5 / pp.560-573 / 2019
  • Two supervised learning algorithms, a basic neural network and a long short-term memory recurrent neural network, are applied to traffic including DDoS attacks. The joint effects of preprocessing methods and machine learning hyperparameters on performance are investigated. Values representing attack characteristics are extracted from the datasets and preprocessed by two methods. Binary classification and two optimizers are used. Some hyperparameters are obtained by exhaustive search for fast and accurate detection, while others are fixed as constants to account for performance and data characteristics. An experiment is performed via TensorFlow on three traffic datasets. Three scenarios are considered to investigate the effects of learning earlier traffic on sequential traffic analysis, the effects of learning one dataset when applying the model to another dataset, and whether the algorithms can be used on recent attack traffic. Experimental results show that the chosen preprocessing methods, neural network architectures, hyperparameters, and optimizers are appropriate for DDoS attack detection. The results provide a criterion for the detection accuracy of attacks.
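A hedged sketch of an exhaustive search over a few hyperparameters, assuming TensorFlow/Keras with synthetic stand-in traffic features; the grids, window shape, and epoch count are our assumptions, not the paper's values:

```python
import itertools
import numpy as np
import tensorflow as tf

# Stand-in for preprocessed traffic windows: 1000 samples, 10 time steps,
# 4 extracted attack-characteristic features; labels are attack / normal.
X = np.random.rand(1000, 10, 4).astype("float32")
y = np.random.randint(0, 2, 1000)

best = None
for units, lr, opt_name in itertools.product([16, 32], [1e-3, 1e-2],
                                              ["adam", "sgd"]):
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(units, input_shape=(10, 4)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    opt = (tf.keras.optimizers.Adam(learning_rate=lr) if opt_name == "adam"
           else tf.keras.optimizers.SGD(learning_rate=lr))
    model.compile(optimizer=opt, loss="binary_crossentropy",
                  metrics=["accuracy"])
    hist = model.fit(X, y, epochs=3, validation_split=0.2, verbose=0)
    val_acc = hist.history["val_accuracy"][-1]
    if best is None or val_acc > best[0]:
        best = (val_acc, units, lr, opt_name)

print(best)   # best validation accuracy and its hyperparameter combination
```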

Performance Comparison of the Optimizers in a Faster R-CNN Model for Object Detection of Metaphase Chromosomes

  • Jung, Wonseok; Lee, Byeong-Soo; Seo, Jeongwook
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.11 / pp.1357-1363 / 2019
  • In this paper, we compare the performance of gradient descent optimizers in a Faster Region-based Convolutional Neural Network (R-CNN) model for chromosome object detection in digital images composed of human metaphase chromosomes. In Faster R-CNN, a gradient descent optimizer is used to minimize the objective functions of the region proposal network (RPN) module and of the classification-score and bounding-box regression blocks. Through performance comparisons among four gradient descent optimizers in our experiments, we found that the Adamax optimizer achieved a mean average precision (mAP) of about 52% for Faster R-CNN with a VGG16 base network. For Faster R-CNN with a ResNet50 base network, the Adadelta optimizer achieved a mAP of about 58%.
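A minimal sketch of swapping the gradient descent optimizer under a Faster R-CNN model, assuming a recent torchvision; the learning rate, toy image, and single training step are illustrative assumptions:

```python
import torch
import torchvision

# Build Faster R-CNN with untrained weights, purely for illustration.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None)
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adamax(params, lr=1e-3)  # or torch.optim.Adadelta(params)

model.train()
images = [torch.rand(3, 256, 256)]               # one toy image
targets = [{"boxes": torch.tensor([[10.0, 10.0, 50.0, 60.0]]),
            "labels": torch.tensor([1])}]

# One step: minimize the summed RPN and detection-head losses.
loss_dict = model(images, targets)
loss = sum(loss_dict.values())
optimizer.zero_grad()
loss.backward()
optimizer.step()
```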

A Unicode based Deep Handwritten Character Recognition model for Telugu to English Language Translation

  • BV Subba Rao; J. Nageswara Rao; Bandi Vamsi; Venkata Nagaraju Thatha; Katta Subba Rao
    • International Journal of Computer Science & Network Security / v.24 no.2 / pp.101-112 / 2024
  • Telugu is considered the fourth most used language in India, especially in the regions of Andhra Pradesh, Telangana, and Karnataka, and its speaker base is also growing internationally. The language comprises various dependent and independent vowels, consonants, and digits. Despite this, Telugu Handwritten Character Recognition (HCR) has seen little advancement. HCR is a neural network technique for converting a document image into editable text, which can then be used in many other applications; it saves time and effort by removing the need to start over from scratch each time. In this work, a Unicode-based Handwritten Character Recognition (U-HCR) model is developed for translating handwritten Telugu characters into English. Using the Centre of Gravity (CG) in our model, a compound character can easily be divided into individual characters with the help of Unicode values. Both online and offline Telugu character datasets were used to train the model. To extract features from the scanned image, we used a convolutional neural network along with machine learning classifiers such as Random Forest and Support Vector Machine. Stochastic Gradient Descent (SGD), Root Mean Square Propagation (RMS-P), and Adaptive Moment Estimation (ADAM) optimizers are used in this work to enhance the performance of U-HCR and to reduce the loss value. On both the online and offline datasets, the proposed model showed promising results, with accuracies of 90.28% for SGD, 96.97% for RMS-P, and 93.57% for ADAM.
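The feature-extraction pipeline the abstract describes, a CNN producing features that feed classical classifiers, can be sketched as follows; the image size, network depth, and random stand-in data are our assumptions:

```python
import numpy as np
import tensorflow as tf
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Stand-in data: 200 grayscale character images with 10 class labels.
X = np.random.rand(200, 32, 32, 1).astype("float32")
y = np.random.randint(0, 10, 200)

# Small CNN used purely as a feature extractor.
cnn = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
])
features = cnn.predict(X, verbose=0)

# Classical classifiers trained on the CNN features.
rf = RandomForestClassifier().fit(features, y)
svm = SVC().fit(features, y)
print(rf.score(features, y), svm.score(features, y))   # training accuracy
```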

Employing TLBO and SCE for optimal prediction of the compressive strength of concrete

  • Zhao, Yinghao; Moayedi, Hossein; Bahiraei, Mehdi; Foong, Loke Kok
    • Smart Structures and Systems / v.26 no.6 / pp.753-763 / 2020
  • Early prediction of the Compressive Strength of Concrete (CSC) is a significant task in civil engineering construction projects. This study is therefore dedicated to introducing two novel hybrids of neural computing, namely Shuffled Complex Evolution (SCE) and Teaching-Learning-Based Optimization (TLBO), for predicting the CSC. The algorithms are applied to a Multi-Layer Perceptron (MLP) network to create the SCE-MLP and TLBO-MLP ensembles. The results revealed, first, that intelligent models can properly analyze and generalize the non-linear relationship between the CSC and its influential parameters. For example, the smallest and largest values of the CSC were 17.19 and 58.53 MPa, while the outputs of the MLP, SCE-MLP, and TLBO-MLP ranged over [17.61, 54.36], [17.69, 55.55], and [18.07, 53.83], respectively. Second, applying the SCE and TLBO optimizers increased the correlation of the MLP predictions from 93.58% to 97.32% and 97.22%, respectively. The prediction error was also reduced by around 34% and 31%, which indicates the high efficiency of these algorithms. Moreover, regarding the computation time needed to implement the SCE-MLP and TLBO-MLP models, SCE is a considerably more time-efficient optimizer. Nevertheless, both suggested models can be promising substitutes for laboratory and destructive CSC evaluation models.
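For orientation, a minimal TLBO loop of the kind used to tune MLP parameters is sketched below; the population size, iteration count, bounds, and toy quadratic loss are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def tlbo(loss, dim, pop=20, iters=100):
    """Minimal Teaching-Learning-Based Optimization loop."""
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, (pop, dim))
    F = np.array([loss(x) for x in X])
    for _ in range(iters):
        teacher, mean = X[F.argmin()], X.mean(axis=0)
        for i in range(pop):
            # Teacher phase: pull the learner toward the teacher.
            TF = rng.integers(1, 3)              # teaching factor in {1, 2}
            cand = X[i] + rng.random(dim) * (teacher - TF * mean)
            fc = loss(cand)
            if fc < F[i]:
                X[i], F[i] = cand, fc
            # Learner phase: learn from a randomly chosen classmate.
            j = int(rng.integers(pop))
            if j != i:
                step = X[i] - X[j] if F[i] < F[j] else X[j] - X[i]
                cand = X[i] + rng.random(dim) * step
                fc = loss(cand)
                if fc < F[i]:
                    X[i], F[i] = cand, fc
    return X[F.argmin()]

best_w = tlbo(lambda w: float(np.sum(w ** 2)), dim=10)   # toy quadratic loss
```

The same loop could, in principle, wrap an MLP's flattened weight vector as the search variable, which is the hybridization the abstract describes.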