• Title/Summary/Keyword: Neural network hyper-parameter

Search Result 27

Searching Effective Network Parameters to Construct Convolutional Neural Networks for Object Detection (물체 검출 컨벌루션 신경망 설계를 위한 효과적인 네트워크 파라미터 추출)

  • Kim, Nuri;Lee, Donghoon;Oh, Songhwai
    • Journal of KIISE / v.44 no.7 / pp.668-673 / 2017
  • Deep neural networks have shown remarkable performance in various fields of pattern recognition, such as voice recognition, image recognition, and object detection. However, the underlying mechanisms of these networks have not been fully revealed. In this paper, we focus on an empirical analysis of network parameters. Faster R-CNN (region-based convolutional neural network) was used as the baseline network, and three important parameters were analyzed: the dropout ratio, which prevents overfitting of the neural network; the size of the anchor boxes; and the activation function. We also compared the performance of dropout and batch normalization. The network performed favorably when the dropout ratio was 0.3, and the size of the anchor boxes showed no notable relation to the performance of the network. The results showed that batch normalization cannot entirely substitute for dropout. A leaky ReLU (rectified linear unit) with a negative-domain slope of 0.02 showed comparably good performance.
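For readers unfamiliar with the two parameters named in this abstract, the following is a minimal sketch of a leaky ReLU with a negative-domain slope of 0.02 and of dropout with a ratio of 0.3. The function names are hypothetical, and the "inverted dropout" scaling is an assumption about the usual formulation, not something the abstract states about the authors' implementation.

```python
import random


def leaky_relu(x, negative_slope=0.02):
    """Leaky ReLU: identity for x >= 0, a small linear slope for x < 0."""
    return x if x >= 0 else negative_slope * x


def dropout(values, ratio=0.3, rng=None):
    """Inverted dropout (assumed formulation): zero each value with
    probability `ratio` and scale survivors by 1 / (1 - ratio)."""
    rng = rng or random.Random()
    keep = 1.0 - ratio
    return [v / keep if rng.random() >= ratio else 0.0 for v in values]


if __name__ == "__main__":
    print([leaky_relu(x) for x in (-2.0, -0.5, 0.0, 1.5)])
    print(dropout([1.0, 1.0, 1.0, 1.0, 1.0], ratio=0.3, rng=random.Random(0)))
```

At inference time dropout is normally disabled, which is why the surviving activations are rescaled during training in this formulation.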

Compressive strength estimation of concrete containing zeolite and diatomite: An expert system implementation

  • Ozcan, Giyasettin;Kocak, Yilmaz;Gulbandilar, Eyyup
    • Computers and Concrete / v.21 no.1 / pp.21-30 / 2018
  • In this study, we analyze the behavior of concrete containing zeolite and diatomite using expert system methods, namely an artificial neural network and an adaptive network-based fuzzy inference system. We use seven different concrete mixes containing zeolite, diatomite, or a mixture of the two. All seven mixes were subjected to compressive strength experiments at 28, 56, and 90 days, using 63 specimens in total. The results of these experiments serve as input data for training and testing the expert system methods. For both the artificial neural network and the adaptive network-based fuzzy models, the data comprise seven input parameters: the age of the samples (days) and the amounts of Portland cement, zeolite, diatomite, aggregate, water, and hyper plasticizer. The output parameter is the compressive strength of the concrete. The training and testing results show that both expert system models provide a promising means of predicting the compressive strength of concrete containing zeolite and diatomite.

Feasibility Study of Google's Teachable Machine in Diagnosis of Tooth-Marked Tongue

  • Jeong, Hyunja
    • Journal of dental hygiene science / v.20 no.4 / pp.206-212 / 2020
  • Background: Teachable Machine is a web-based machine learning tool intended for the general public. In this paper, the feasibility of Google's Teachable Machine (ver. 2.0) was studied for the diagnosis of tooth-marked tongue. Methods: For machine learning of tooth-marked tongue diagnosis, a total of 1,250 tongue images from Kaggle's website were used. Ninety percent of the images were used for the training data set, and the remaining 10% were used for the test data set. Machine learning was performed on the separated images using Google's Teachable Machine (ver. 2.0). To optimize the machine learning parameters, I measured the diagnostic accuracy as a function of the number of epochs, the batch size, and the learning rate. After hyper-parameter tuning, ROC (receiver operating characteristic) analysis was used to determine the sensitivity (true positive rate, TPR) and the false positive rate (FPR, i.e. 1 − specificity) of the machine learning model for diagnosing tooth-marked tongue. Results: To evaluate the usefulness of Teachable Machine in clinical application, I used 634 tooth-marked tongue images and 491 no-marked tongue images for machine learning. The diagnostic accuracy for tooth-marked tongue was best when the number of epochs, the learning rate, and the batch size were 75, 0.0001, and 128, respectively. The accuracies for tooth-marked tongue and no-marked tongue were 92.1% and 72.6%, respectively, and the sensitivity (TPR) and false positive rate (FPR) were 0.92 and 0.28, respectively. Conclusion: These results are more accurate than Li's experimental results obtained with a convolutional neural network. Google's Teachable Machine shows good performance in the diagnosis of tooth-marked tongue after hyper-parameter tuning. We confirmed that the tool is useful for several clinical applications.
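Sensitivity, specificity, and the false positive rate all come from the same binary confusion matrix, and it can help to see how they relate. The sketch below uses hypothetical counts chosen only for illustration; they are not the study's data, and the function name is an assumption.

```python
def binary_rates(tp, fn, tn, fp):
    """Compute standard rates from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (TPR)
    specificity = tn / (tn + fp)   # true negative rate (TNR)
    fpr = fp / (fp + tn)           # false positive rate = 1 - specificity
    return {"sensitivity": sensitivity, "specificity": specificity, "fpr": fpr}


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the paper).
    rates = binary_rates(tp=92, fn=8, tn=72, fp=28)
    for name, value in rates.items():
        print(f"{name}: {value:.2f}")
```

Note that the false positive rate is the complement of specificity, so a classifier with an FPR of 0.28 has a specificity of 0.72.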

Performance Evaluation of Machine Learning and Deep Learning Algorithms in Crop Classification: Impact of Hyper-parameters and Training Sample Size (작물분류에서 기계학습 및 딥러닝 알고리즘의 분류 성능 평가: 하이퍼파라미터와 훈련자료 크기의 영향 분석)

  • Kim, Yeseul;Kwak, Geun-Ho;Lee, Kyung-Do;Na, Sang-Il;Park, Chan-Won;Park, No-Wook
    • Korean Journal of Remote Sensing / v.34 no.5 / pp.811-827 / 2018
  • The purpose of this study is to compare a machine learning algorithm and a deep learning algorithm for crop classification using multi-temporal remote sensing data. To this end, the impacts of (1) hyper-parameters and (2) training sample size on the machine learning and deep learning algorithms were compared and analyzed for Haenam-gun, Korea and Illinois State, USA. In the comparison experiment, a support vector machine (SVM) was applied as the machine learning algorithm and a convolutional neural network (CNN) as the deep learning algorithm. In particular, a 2D-CNN that considers 2-dimensional spatial information and a 3D-CNN that extends the 2D-CNN with a time dimension were applied. The experiment showed that the optimal hyper-parameter values of the CNN, selected from a range of hyper-parameters, were more similar across the two study areas than those of the SVM. Based on this result, although optimizing a CNN model takes considerable time, transfer learning that extends an optimized CNN model to other regions is considered feasible. In the experiments with various training sample sizes, the impact of sample size was larger on the CNN than on the SVM. This impact was particularly pronounced in Illinois State, which has heterogeneous spatial patterns. In addition, the 3D-CNN showed its lowest classification performance in Illinois State, which is attributed to over-fitting caused by model complexity. That is, although the training accuracy of the 3D-CNN model was high, its classification performance was relatively degraded by the heterogeneous patterns and noise in the input data. This result implies that a proper classification algorithm should be selected considering the spatial characteristics of the study area. Also, a large number of training samples is necessary to guarantee high classification performance with a CNN, particularly with a 3D-CNN.

Predicting blast-induced ground vibrations at limestone quarry from artificial neural network optimized by randomized and grid search cross-validation, and comparative analyses with blast vibration predictor models

  • Salman Ihsan;Shahab Saqib;Hafiz Muhammad Awais Rashid;Fawad S. Niazi;Mohsin Usman Qureshi
    • Geomechanics and Engineering / v.35 no.2 / pp.121-133 / 2023
  • The demand for cement and crushed limestone materials has increased manyfold due to the tremendous increase in construction activities in Pakistan during the past few decades. The number of cement production industries has increased correspondingly, and so have the rock-blasting operations at limestone quarry sites. However, the safety procedures warranted at these sites for blast-induced ground vibrations (BIGV) have not been adequately developed and/or implemented. Proper prediction and monitoring of BIGV are necessary to ensure the safety of structures in the vicinity of these quarry sites. In this paper, an attempt has been made to predict BIGV using an artificial neural network (ANN) at three selected limestone quarries in Pakistan. The ANN was developed in Python using Keras with a sequential model and dense layers. The hyper-parameters and the number of neurons in each layer were optimized using randomized and grid search methods. The input parameters for the model are distance, maximum charge per delay (MCPD), depth of hole, burden, spacing, and number of blast holes, whereas peak particle velocity (PPV) is the only output parameter. A total of 110 blast vibration datasets were recorded from three different limestone quarries. The dataset was divided into 85% for training the neural network and 15% for testing it. A five-layer ANN with a topology of 6-32-32-256-1 was trained with the Rectified Linear Unit (ReLU) activation function and the Adam optimization algorithm, using a learning rate of 0.001 and a batch size of 32. The blast datasets were used to compare the performance of the ANN, multivariate regression analysis (MVRA), and empirical predictors. Performance was evaluated using the coefficient of determination (R2), mean absolute error (MAE), mean squared error (MSE), mean absolute percentage error (MAPE), and root mean squared error (RMSE) for predicted and measured PPV. To determine the relative influence of each parameter on the PPV, sensitivity analyses were performed for all input parameters. The analyses reveal that the ANN outperforms MVRA and the other empirical predictors, and that distance and MCPD account for 83% of the influence on PPV, while hole depth, number of blast holes, burden, and spacing account for the remaining 17%. This research provides valuable insights into improving safety measures and ensuring the structural integrity of buildings near limestone quarry sites.
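The five evaluation metrics named in this abstract can be computed directly from paired measured and predicted values. The sketch below uses their standard textbook definitions; the function name and the sample values are hypothetical and do not come from the paper.

```python
import math


def regression_metrics(measured, predicted):
    """Return R2, MAE, MSE, MAPE (in percent) and RMSE.

    Assumes equal-length sequences, non-zero measured values (for MAPE)
    and measured values that are not all identical (for R2).
    """
    n = len(measured)
    errors = [m - p for m, p in zip(measured, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100.0 * sum(abs(e / m) for e, m in zip(errors, measured)) / n
    mean_measured = sum(measured) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "MAE": mae, "MSE": mse, "MAPE": mape, "RMSE": rmse}


if __name__ == "__main__":
    # Hypothetical PPV values for illustration only.
    measured = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    for name, value in regression_metrics(measured, predicted).items():
        print(f"{name}: {value:.4f}")
```

Reporting several metrics together is useful because MAPE is scale-free while MAE, MSE, and RMSE are expressed in the units of the output (or its square).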

A Study on the Performance Improvement of Anomaly-Based IDS Through the Improvement of Training Data (학습 데이터 개선을 통한 Anomaly-based IDS의 성능 향상 방안)

  • Moon, Sang Tae;Lee, Soo Jin
    • Convergence Security Journal / v.19 no.4 / pp.181-188 / 2019
  • Recently, there have been active attempts to apply artificial intelligence technology to create the normal profile in anomaly-based intrusion detection systems. However, existing studies that propose such applications mostly focus on improving the structure of artificial neural networks and finding optimal hyper-parameter values, and fail to address various problems that may arise from the misconfiguration of learning data. In this paper, we identify through experiments the main problems that may arise from the misconfiguration of learning data. We also propose a novel approach that addresses these problems and improves detection performance by reconstructing the learning data.

A Study on Peak Load Prediction Using TCN Deep Learning Model (TCN 딥러닝 모델을 이용한 최대전력 예측에 관한 연구)

  • Lee Jung Il
    • KIPS Transactions on Software and Data Engineering / v.12 no.6 / pp.251-258 / 2023
  • Accurate peak load prediction is necessary to supply electric power and operate the power system stably. It is especially important in winter and summer, when the peak load is higher than in other seasons. If the peak load is predicted to be higher than the actual peak load, the start-up costs of power plants increase, causing economic loss to the company. On the other hand, if the peak load is predicted to be lower than the actual peak load, a blackout may occur due to a lack of power plants capable of generating electricity. Economic losses and blackouts can be prevented by minimizing the prediction error of the peak load. In this paper, TCN, a recent deep learning model, is used to minimize the prediction error of the peak load. Even when the same deep learning model is used, performance differs depending on the hyper-parameters. I therefore propose methods for optimizing the hyper-parameters of TCN for peak load prediction. The model was trained on data from 2006 to 2021, and the prediction error was tested using data from 2022. The deep learning model optimized by the proposed methods was confirmed to outperform other deep learning models.

CNN-based In-loop Filter on TU Block (TU 블록 크기에 따른 CNN기반 인루프필터)

  • Kim, Yang-Woo;Jeong, Seyoon;Cho, Seunghyun;Lee, Yung-Lyul
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2018.11a / pp.15-17 / 2018
  • VVC (Versatile Video Coding) codes the input picture by splitting it into CTUs (Coding Tree Units), which are further partitioned with QTBTT (quadtree plus binary tree and triple tree); TUs (Transform Units) are partitioned in the same units. There are therefore 17 TU sizes: $4\times4$, $4\times8$, $4\times16$, $4\times32$, $8\times4$, $16\times4$, $32\times4$, $8\times8$, $8\times16$, $8\times32$, $16\times8$, $32\times8$, $16\times16$, $16\times32$, $32\times16$, $32\times32$, and $64\times64$. VTM, the existing VVC reference software, restores errors using an in-loop filter consisting of a deblocking filter and SAO (Sample Adaptive Offset). Exploiting the fact that the difference (error) between the original block and the reconstructed block differs statistically with TU size, this paper builds different CNNs (convolutional neural networks) to restore the error and thereby replaces the in-loop filter of VTM. To reduce the error of the reconstructed picture, a CNN based on the Dense Block of DenseNet is configured according to the TU block size, and the networks are designed to share some weights with one another in order to reduce the number of hyper-parameters and the complexity.


Improved Performance of Image Semantic Segmentation using NASNet (NASNet을 이용한 이미지 시맨틱 분할 성능 개선)

  • Kim, Hyoung Seok;Yoo, Kee-Youn;Kim, Lae Hyun
    • Korean Chemical Engineering Research / v.57 no.2 / pp.274-282 / 2019
  • In recent years, big data analysis has expanded to include automatic control through reinforcement learning as well as prediction through modeling. Research on the use of image data is actively carried out in various industrial fields such as chemicals, manufacturing, agriculture, and bio-industry. In this paper, we applied NASNet, an AutoML reinforcement learning algorithm, to DeepU-Net, a neural network modified from U-Net, to improve image semantic segmentation performance. We used BRATS2015 MRI data for performance verification. Simulation results show that DeepU-Net outperforms the U-Net neural network. To improve the image segmentation performance further, we removed the dropout layers that are typically applied to neural networks and selected the number of kernels and filters obtained through reinforcement learning in DeepU-Net as hyperparameters of the neural network. The results show that the training accuracy is 0.5% better and the validation accuracy 0.3% better than those of DeepU-Net. The results of this study can be applied to various fields such as MRI brain imaging diagnosis, thermal imaging camera abnormality diagnosis, nondestructive inspection diagnosis, chemical leakage monitoring, and forest fire monitoring through CCTV.

Development of Prediction Model for Nitrogen Oxides Emission Using Artificial Intelligence (인공지능 기반 질소산화물 배출량 예측을 위한 연구모형 개발)

  • Jo, Ha-Nui;Park, Jisu;Yun, Yongju
    • Korean Chemical Engineering Research / v.58 no.4 / pp.588-595 / 2020
  • Prediction and control of nitrogen oxides (NOx) emission are of great interest in industry due to stricter environmental regulations. Herein, we propose an artificial intelligence (AI)-based framework for the prediction of NOx emission. The framework includes pre-processing of data for training neural networks and evaluation of the AI-based models. In this work, Long Short-Term Memory (LSTM), a type of recurrent neural network, was adopted to reflect the time series characteristics of NOx emissions. A decision tree was used to determine the time window of the LSTM prior to training the network. The neural network was trained with operational data from a heating furnace. The optimal model was obtained by optimizing the hyper-parameters. The LSTM model provided a reliable prediction of NOx emission for both training and test data, showing an accuracy of 93% or more. The proposed AI-based framework will provide new opportunities for predicting the emission of various air pollutants with time series characteristics.
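The "time window" mentioned in this abstract is the number of past time steps fed to the LSTM for each prediction. As a rough illustration only, the sketch below slices a series into fixed-length windows with a one-step-ahead target; the function name, the one-step-ahead target, and the sample values are all assumptions, and the window length is simply passed in rather than chosen by a decision tree as in the paper.

```python
def make_windows(series, window):
    """Split a sequence into (inputs, targets) for one-step-ahead prediction.

    Each input holds `window` consecutive values; its target is the value
    that immediately follows the window.
    """
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


if __name__ == "__main__":
    # Hypothetical emission readings for illustration only.
    readings = [10, 12, 11, 13, 15, 14]
    x, y = make_windows(readings, window=3)
    for window_values, target in zip(x, y):
        print(window_values, "->", target)
```

A longer window gives the network more history per sample but yields fewer training samples from the same series, which is one reason the window length is treated as a parameter to be selected.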