• Title/Summary/Keyword: Neural network hyper-parameter

Effects of Hyper-parameters and Dataset on CNN Training

  • Nguyen, Huu Nhan;Lee, Chanho
    • Journal of IKEEE / v.22 no.1 / pp.14-20 / 2018
  • The purpose of training a convolutional neural network (CNN) is to obtain weight factors that give high classification accuracy. The initial values of the hyper-parameters affect the training results, so it is important to train a CNN with a suitable hyper-parameter set: the learning rate, the batch size, the initialization of weight factors, and the optimizer. We vary a single hyper-parameter while the others are fixed in order to find a hyper-parameter set that gives higher classification accuracy and requires shorter training time. A proposed VGG-like CNN is used for training because VGG is widely used. The CNN is trained on four datasets: CIFAR10, CIFAR100, GTSRB and DSDL-DB. The effects of normalization and data transformation of the datasets are also investigated, and a training scheme using merged datasets is proposed.
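
The one-hyper-parameter-at-a-time procedure described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the baseline values, the candidate lists, and the function name `one_at_a_time` are assumptions made for the example, and the training step itself is left out.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical baseline and candidate values; none of these numbers come from the paper.
BASELINE: Dict[str, Any] = {
    "learning_rate": 0.01,
    "batch_size": 64,
    "weight_init": "he_normal",
    "optimizer": "sgd",
}
CANDIDATES: Dict[str, List[Any]] = {
    "learning_rate": [0.1, 0.01, 0.001],
    "batch_size": [32, 64, 128],
    "weight_init": ["he_normal", "glorot_uniform"],
    "optimizer": ["sgd", "adam"],
}


def one_at_a_time(
    baseline: Dict[str, Any], candidates: Dict[str, List[Any]]
) -> Iterator[Dict[str, Any]]:
    """Yield configs that differ from the baseline in exactly one hyper-parameter."""
    for name, values in candidates.items():
        for value in values:
            if value == baseline[name]:
                continue  # the baseline itself is run once, separately
            config = dict(baseline)
            config[name] = value
            yield config


# The baseline run plus every single-parameter variation of it.
configs = [BASELINE] + list(one_at_a_time(BASELINE, CANDIDATES))
for config in configs:
    print(config)
```

Each printed configuration would be passed to a training run. Because only one value changes at a time, any change in accuracy or training time can be attributed to that hyper-parameter, at the cost of missing interactions between hyper-parameters.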

Graph Convolutional - Network Architecture Search : Network architecture search Using Graph Convolution Neural Networks (그래프 합성곱-신경망 구조 탐색 : 그래프 합성곱 신경망을 이용한 신경망 구조 탐색)

  • Su-Youn Choi;Jong-Youel Park
    • The Journal of the Convergence on Culture Technology / v.9 no.1 / pp.649-654 / 2023
  • This paper proposes the design of a neural network structure search model that uses graph convolutional neural networks. Because deep learning learns as a black box, it is difficult to verify whether a designed model has a structure with optimized performance. A neural network structure search model is composed of a recurrent neural network that creates a model and a convolutional neural network that is the generated network. Conventional neural network structure search models use recurrent neural networks; in this paper, we propose GC-NAS, which uses graph convolutional neural networks instead of recurrent neural networks to create convolutional neural network models. The proposed GC-NAS uses the Layer Extraction Block to explore depth, and the Hyper Parameter Prediction Block to explore spatial and temporal information (hyper parameters) in parallel, based on the depth information. Because the depth information is reflected, the search area is wider, and because the search runs in parallel with the depth information, the purpose of the model's search area is clear. GC-NAS is therefore judged to be superior in theoretical structure to the conventional recurrent-neural-network-based search model. Through its graph convolutional neural network block and graph generation algorithm, GC-NAS is expected to solve the problems of the high-dimensional time axis and the limited spatial search range of recurrent neural networks in existing neural network structure search models. In addition, we hope that the GC-NAS proposed in this paper will prompt active research on the application of graph convolutional neural networks to neural network structure search.

Hyper Parameter Tuning Method based on Sampling for Optimal LSTM Model

  • Kim, Hyemee;Jeong, Ryeji;Bae, Hyerim
    • Journal of the Korea Society of Computer and Information / v.24 no.1 / pp.137-143 / 2019
  • As the performance of computers increases, deep learning, which faced technical limitations in the past, is being used in increasingly diverse ways. In many fields, deep learning has contributed to the creation of added value, and it is used on the basis of more data as its applications become more diverse. Obtaining a better-performing model therefore takes longer than before, so it is necessary to find the optimal model more quickly. In artificial neural network modeling, a tuning process that changes various elements of the neural network model is used to improve model performance. Apart from Grid Search and Manual Search, which are widely used tuning methods, most methodologies have been developed around heuristic algorithms. A heuristic algorithm can produce results in a short time, but the results are likely to be a local optimal solution. Obtaining a global optimal solution eliminates the possibility of a local optimal solution. Although the Brute Force Method is commonly used to find the global optimal solution, it is not applicable here because the number of hyper parameter combinations is infinite. In this paper, we use a statistical technique to reduce the number of possible cases so that we can find the global optimal solution.

Comparative Analysis of PM10 Prediction Performance between Neural Network Models

  • Jung, Yong-Jin;Oh, Chang-Heon
    • Journal of information and communication convergence engineering / v.19 no.4 / pp.241-247 / 2021
  • Particulate matter has emerged as a serious global problem, necessitating highly reliable information on the matter. Therefore, various algorithms have been used in studies to predict particulate matter. In this study, we compared the prediction performance of neural network models that have been actively studied for particulate matter prediction. Among the neural network algorithms, a deep neural network (DNN), a recurrent neural network, and long short-term memory were used to design the optimal prediction model using a hyper-parameter search. In the comparative analysis, which used the root mean square error (RMSE) and the level of accuracy as evaluation metrics, the DNN model showed a lower RMSE than the other algorithms. The stability of the recurrent neural network was slightly lower than that of the other algorithms, although its accuracy was higher.

A Study on the Prediction of Optimized Injection Molding Condition using Artificial Neural Network (ANN) (인공신경망을 활용한 최적 사출성형조건 예측에 관한 연구)

  • Yang, D.C.;Lee, J.H.;Yoon, K.H.;Kim, J.S.
    • Transactions of Materials Processing / v.29 no.4 / pp.218-228 / 2020
  • The prediction of the final mass and the optimized process conditions of injection-molded products using an Artificial Neural Network (ANN) was demonstrated. The ANN was modeled with 10 input parameters and one output parameter (mass). The selected input parameters were melt temperature, mold temperature, injection speed, packing pressure, packing time, cooling time, back pressure, plastification speed, V/P switchover, and suck back. To generate training data for the ANN model, 77 experiments based on a combination of orthogonal sampling and random sampling were performed. The collected training data were normalized to eliminate scale differences between factors and thereby improve the prediction performance of the ANN model. Grid search and random search were used to find the optimized hyper-parameters of the ANN model. After the ANN model was trained, optimized process conditions that satisfied the target mass of 41.14 g were predicted. The predicted process conditions were verified through actual injection molding experiments. The verification showed that the average deviation under the optimized conditions was 0.15±0.07 g. This value confirms that the proposed procedure can successfully predict the optimized process conditions for the target mass of injection-molded products.
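
The two search strategies named in this abstract, grid search and random search, can be contrasted with a short sketch. Everything specific here is an assumption for illustration: the search space `SPACE`, the stand-in objective `toy_loss`, and the trial count are hypothetical, and a real run would train the ANN and return its validation error instead.

```python
import itertools
import random

# Hypothetical search space; the abstract does not list the values that were searched.
SPACE = {
    "hidden_units": [8, 16, 32, 64],
    "learning_rate": [0.1, 0.01, 0.001],
}


def toy_loss(config):
    """Stand-in for a validation loss; a real run would train the ANN instead."""
    return abs(config["hidden_units"] - 32) / 32 + abs(config["learning_rate"] - 0.01)


def grid_search(space, loss):
    """Evaluate every combination in the space and return the best one."""
    keys = list(space)
    configs = [dict(zip(keys, combo)) for combo in itertools.product(*space.values())]
    return min(configs, key=loss)


def random_search(space, loss, n_trials, seed=0):
    """Evaluate n_trials randomly drawn combinations and return the best one."""
    rng = random.Random(seed)
    configs = [{k: rng.choice(v) for k, v in space.items()} for _ in range(n_trials)]
    return min(configs, key=loss)


best_grid = grid_search(SPACE, toy_loss)
best_random = random_search(SPACE, toy_loss, n_trials=5)
print("grid search:  ", best_grid)
print("random search:", best_random)
```

Grid search evaluates all 12 combinations and is guaranteed to find the best point on the grid. Random search evaluates only the requested number of trials, so it is cheaper but may return a worse configuration.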

Prediction of rebound in shotcrete using deep bi-directional LSTM

  • Suzen, Ahmet A.;Cakiroglu, Melda A.
    • Computers and Concrete / v.24 no.6 / pp.555-560 / 2019
  • During the application of shotcrete, part of the concrete bounces back after hitting the surface, the reinforcement, or previously sprayed concrete. This rebound material is not added back to the mixture and is considered waste. In this study, a deep neural network model was developed to predict the rebound material during shotcrete application. The factors affecting rebound and the datasets for these parameters were obtained from previous experiments. The proposed deep neural network model uses the Long Short-Term Memory (LSTM) architecture, chosen to suit this data set. In developing the proposed four-tier prediction model, the dataset was divided into 90% training and 10% test data. The deep neural network was modeled with 11 dependent variables and 1 independent variable, using the hyper parameter values found to be most appropriate for prediction. The accuracy and error performance of the LSTM model were evaluated using MSE and RMSE. The model achieved a success rate of 93.2% at the end of training and 85.6% on the test data, a difference of 7.6 percentage points. In the next stage, the aim is to increase the success rate of the model by enlarging the data set with synthetic and experimental data. In addition, predicting the amount of rebound during dry-mix shotcrete application is expected to provide economic gain as well as contribute to environmental protection.
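
The 90%/10% split and the MSE and RMSE metrics mentioned in this abstract can be sketched in a few lines. This is an illustrative assumption about how such a split might be done: the shuffling, the seed, and the helper names `split_90_10`, `mse`, and `rmse` are not taken from the paper.

```python
import math
import random


def split_90_10(rows, seed=0):
    """Shuffle rows and return (train, test) with a 90% / 10% split."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # shuffling is an assumption of this sketch
    cut = len(shuffled) * 9 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))


train, test = split_90_10(range(100))
print(len(train), len(test))
print(mse([1, 2, 3], [1, 2, 5]), rmse([0, 0], [3, 3]))
```

RMSE is the square root of MSE, so it is expressed in the same units as the predicted quantity, which makes it easier to compare with measured rebound amounts.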

Application of Convolution Neural Network to Flare Forecasting using solar full disk images

  • Yi, Kangwoo;Moon, Yong-Jae;Park, Eunsu;Shin, Seulki
    • The Bulletin of The Korean Astronomical Society / v.42 no.2 / pp.60.1-60.1 / 2017
  • In this study we apply a convolutional neural network (CNN) to solar flare occurrence prediction with various parameter options, using the 00:00 UT MDI images from 1996 to 2010 (4962 images in total). We assume that only X, M and C class flares correspond to "flare occurrence" and the others to "non-flare". We look for the best options for the models with two pre-trained CNN models (AlexNet and GoogLeNet) by modifying the training images and changing the hyper parameters. Our major results are as follows. First, the flare occurrence predictions are relatively good, with accuracies of about 80%. Second, the flare prediction models based on AlexNet and GoogLeNet give similar results, but AlexNet is faster than GoogLeNet. Third, modifying the training images to reduce the projection effect is not effective. Fourth, the skill scores of our flare occurrence model are mostly better than those of previous models.
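
The abstract reports accuracy and "skill scores" for a binary flare / non-flare forecast without naming the scores. As a hedged illustration, the sketch below computes accuracy together with two scores that are standard in forecast verification, the true skill statistic (TSS) and the Heidke skill score (HSS), from a 2x2 contingency table. Whether the authors used these particular scores is an assumption, and the counts in the example are made up.

```python
def skill_scores(tp, fn, fp, tn):
    """Return accuracy, TSS and HSS for a binary forecast.

    tp: flares forecast and observed      fn: flares observed but not forecast
    fp: flares forecast but not observed  tn: non-flares correctly forecast
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # TSS: hit rate minus false alarm rate.
    tss = tp / (tp + fn) - fp / (fp + tn)
    # HSS: improvement over the number of correct forecasts expected by chance.
    hss = 2 * (tp * tn - fn * fp) / ((tp + fn) * (fn + tn) + (tp + fp) * (fp + tn))
    return {"accuracy": accuracy, "tss": tss, "hss": hss}


# Made-up counts for illustration only; they are not results from the study.
perfect = skill_scores(tp=10, fn=0, fp=0, tn=10)
chance = skill_scores(tp=5, fn=5, fp=5, tn=5)
print(perfect)
print(chance)
```

Both scores are 1 for a perfect forecast and 0 for a forecast with no skill. Unlike plain accuracy, they do not reward a model for always predicting the majority class.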

A Comparative Analysis of the Forecasting Performance of Coal and Iron Ore in Gwangyang Port Using Stepwise Regression and Artificial Neural Network Model (단계적 회귀분석과 인공신경망 모형을 이용한 광양항 석탄·철광석 물동량 예측력 비교 분석)

  • Cho, Sang-Ho;Nam, Hyung-Sik;Ryu, Ki-Jin;Ryoo, Dong-Keun
    • Journal of Navigation and Port Research / v.44 no.3 / pp.187-194 / 2020
  • Accurate forecasting of freight volume is very important for establishing major port policies and future operation plans, and related studies are being conducted for this reason. In this paper, stepwise regression analysis and an artificial neural network model were used to compare the predictive power of each model for Gwangyang Port, the largest domestic port for coal and iron ore transportation. Data covering a total of 121 months, from January 2009 to January 2019, were used. Factors affecting coal and iron ore trade volume were selected and classified into supply-related factors and market/economy-related factors. In the stepwise regression analysis, the tonnage of ships entering the port, the coal price, and the dollar exchange rate were selected as the final variables for the Gwangyang Port coal volume forecasting model. In the iron ore volume forecasting model, the tonnage of ships entering the port and the price of iron ore were selected as the final variables. In the analysis using the artificial neural network model, the various hyper-parameters affecting model performance were selected by trial and error to identify the optimal model. The analysis results showed that the artificial neural network model had better predictive performance than the stepwise regression analysis. The model with the best performance was the Gwangyang Port Coal Volume Forecasting Artificial Neural Network Model. When the values forecast by the various predictive models were compared with the actually measured values, the artificial neural network model came closer to the actual highest and lowest points than the stepwise regression analysis.

Recent Research & Development Trends in Automated Machine Learning (자동 기계학습(AutoML) 기술 동향)

  • Moon, Y.H.;Shin, I.H.;Lee, Y.J.;Min, O.G.
    • Electronics and Telecommunications Trends / v.34 no.4 / pp.32-42 / 2019
  • The performance of machine learning algorithms depends significantly on how a configuration of hyperparameters is identified and how a neural network architecture is designed. However, this requires expert knowledge of the relevant task domains and a prohibitive computation time. To optimize these two processes with minimal effort, many studies have investigated automated machine learning in recent years. This paper reviews the conventional random, grid, and Bayesian methods for hyperparameter optimization (HPO) and addresses recent approaches that speed up the identification of the best set of hyperparameters. We further investigate existing neural architecture search (NAS) techniques based on evolutionary algorithms, reinforcement learning, and gradients, and analyze their theoretical characteristics and performance results. Moreover, future research directions and challenges in HPO and NAS are described.

A Study on the Prediction of Mass and Length of Injection-molded Product Using Artificial Neural Network (인공신경망을 활용한 사출성형품의 질량과 치수 예측에 관한 연구)

  • Yang, Dong-Cheol;Lee, Jun-Han;Kim, Jong-Sun
    • Design & Manufacturing / v.14 no.3 / pp.1-7 / 2020
  • This paper predicts the mass and the length of injection-molded products using the Artificial Neural Network (ANN) method. The ANN was implemented with 5 input parameters and 2 output parameters (mass, length). The selected input parameters were injection time, melt temperature, mold temperature, packing pressure, and packing time. To generate training data for the ANN model, 44 experiments based on the mixed sampling method were performed. The generated training data were normalized to eliminate scale differences between factors and thereby improve the prediction performance of the ANN model. A random search method was used to find the optimized hyper-parameters of the ANN model. After training was complete, the ANN model predicted the mass and the length of the injection-molded product. According to the results, the average error of the ANN for mass was 0.3%. For length, the average deviation of the ANN was 0.043 mm.
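
The normalization step mentioned in this abstract removes scale differences between process factors before training. The abstract does not say which normalization was used; the sketch below assumes min-max scaling to the range [0, 1], and the sample rows (a melt temperature and a packing pressure per row) are invented for illustration.

```python
def min_max_normalize(rows):
    """Scale every column of a row-major table to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0  # constant column -> 0.0
            for value, low, high in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Invented rows of (melt temperature, packing pressure); not data from the paper.
samples = [
    [200.0, 40.0],
    [220.0, 60.0],
    [240.0, 80.0],
]
normalized = min_max_normalize(samples)
print(normalized)
```

After scaling, both factors span the same range, so a factor measured in large units does not dominate the weight updates simply because of its magnitude. The minimum and maximum computed on the training data should be reused when scaling new inputs at prediction time.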