• Title/Abstract/Keywords: hyperparameter

Search results: 125 items (processing time 0.021 s)

A Data-driven Classifier for Motion Detection of Soldiers on the Battlefield using Recurrent Architectures and Hyperparameter Optimization

  • 김준호;채건주;박재민;박경원
    • 지능정보연구 / Vol. 29, No. 1 / pp.107-119 / 2023
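  • Technology that recognizes the motion and movement states of soldiers has recently emerged from the combination of wearable technology and artificial intelligence, and it is drawing attention as a technology that could change the paradigm of troop management. To provide evaluation and solutions in training situations and efficient monitoring in combat situations as intended, the accuracy of state classification must be kept at a very high level. However, when the input data are given as a time series or sequence, conventional feedforward neural networks are limited in how far classification performance can be maximized. Because the human behavioral data handled for military motion recognition on the battlefield (3-axis acceleration and 3-axis angular velocity) require analysis of time-dependent characteristics, this paper proposes a high-performance AI model that uses a long short-term memory (LSTM) network, a recurrent neural network, to capture the movement patterns and order dependencies of the acquired data and to classify eight representative military motions (Sitting, Standing, Walking, Running, Ascending, Descending, Low Crawl, High Crawl). The training conditions and model variables have a decisive effect on accuracy, but they require manual tuning by humans, which is cost-inefficient and does not guarantee optimal values. This paper therefore optimizes the hyperparameters with Bayesian optimization so that the machine itself can find the conditions that maximize generalization performance. As a result, the final architecture reduced the error rate by 62.56% compared with a conventional artificial neural network with a similar number of trainable parameters, and it ultimately achieved military motion recognition with 98.39% accuracy.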
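The two figures quoted at the end of this abstract can be related with a little arithmetic. The sketch below assumes that the 62.56% is a relative reduction of the baseline error rate and that the final error rate is 100% minus the reported accuracy; the abstract states neither explicitly, so the implied baseline values are an inference, not a reported result.

```python
# Figures quoted in the abstract.
final_accuracy_pct = 98.39
relative_error_reduction_pct = 62.56

# Error rate of the final model (assumed to be 100% minus accuracy).
final_error_pct = 100.0 - final_accuracy_pct

# ASSUMPTION: final_error = baseline_error * (1 - relative_reduction).
baseline_error_pct = final_error_pct / (1.0 - relative_error_reduction_pct / 100.0)
baseline_accuracy_pct = 100.0 - baseline_error_pct

print(f"final error rate:            {final_error_pct:.2f}%")
print(f"implied baseline error rate: {baseline_error_pct:.2f}%")
print(f"implied baseline accuracy:   {baseline_accuracy_pct:.2f}%")
```

Under these assumptions the comparison network would sit at roughly a 4.3% error rate.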

Study on the Prediction of Motion Response of Fishing Vessels using Recurrent Neural Networks

  • 서장훈;박동우;남동
    • 해양환경안전학회지 / Vol. 29, No. 5 / pp.505-511 / 2023
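  • In this paper, a deep learning model was built to predict the motion response of small fishing vessels. A dataset was obtained by evaluating the hydrodynamic performance of two small fishing vessels of different sizes. The deep learning model used long short-term memory (LSTM), one of the recurrent neural network techniques. Time-series data of the 6-degree-of-freedom (6-DoF) motions and the wave height were used as model input, and time-series data of the 6-DoF motions were selected as the output labels. To build the optimal LSTM model, the effects of the hyperparameters and the input window length were evaluated. The resulting LSTM model was used to predict the time-series motion response for different incident wave directions. The predicted time-series motion responses agreed well overall with the analysis results. As the time series becomes longer, differences between the predicted values and the analysis results appear, which is attributed to the diminishing influence of training on long-term data. In terms of overall prediction error, about 85% or more of the data showed an error within 10%, confirming that the model predicts the time-series motion response of small fishing vessels well. The LSTM model is expected to be useful in monitoring and warning systems for small fishing vessels.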
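The "input window length" studied here refers to how many past time steps are fed to the LSTM at once. A minimal sketch of how such windows could be cut from a multichannel series follows; the function name `make_windows`, the channel order, and the choice of a one-step-ahead label are assumptions for illustration, since the abstract does not describe the exact labelling scheme.

```python
def make_windows(series, window, n_targets=6):
    """Cut a multichannel time series into (input window, label) pairs.

    series: list of rows, each row assumed to be
            [surge, sway, heave, roll, pitch, yaw, wave_height].
    window: number of past time steps per input sample.
    The label is the 6-DoF motion at the step right after the window
    (ASSUMPTION: one-step-ahead prediction).
    """
    inputs, labels = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        labels.append(series[start + window][:n_targets])
    return inputs, labels


# Toy series: 10 time steps, 7 channels (values are placeholders).
series = [[float(t * 10 + c) for c in range(7)] for t in range(10)]
X, y = make_windows(series, window=4)
print(len(X), len(X[0]), len(X[0][0]), len(y[0]))  # 6 4 7 6
```

A longer window gives the network more history per sample but leaves fewer samples from the same series, which is one reason the window length behaves like a hyperparameter.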

Hyperparameter Optimization and Data Augmentation of Artificial Neural Networks for Prediction of Ammonia Emission Amount from Field-applied Manure

  • 정평곤;임영일
    • Korean Chemical Engineering Research / Vol. 61, No. 1 / pp.123-141 / 2023
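  • In developing models with artificial neural networks (ANNs), data quality strongly affects model performance, and sufficient high-quality data are needed for training. In engineering, however, models often have to be developed from small amounts of data. This paper presents a way to improve the predictive performance of an ANN model using a small dataset (83 samples) on ammonia emission from field-applied livestock manure. The ammonia emission problem, expressed by the Michaelis-Menten equation, consists of 11 input variables and 2 output variables. The output variables are the maximum nitrogen emission (Nmax, kg/ha) and the time to reach half of Nmax (Km, h). Categorical input variables were preprocessed with one-hot encoding, a multidimensional equidistant technique, and the 66 training samples were augmented with 13 additional samples generated by a generative adversarial network (GAN). In addition, a Gaussian process (GP) was used to find the optimal combination of the ANN hyperparameters: the number of hidden layers, the number of neurons in each hidden layer, and the activation function. On the 17 test samples, the existing ANN structure (Lim et al., 2007) had a mean absolute error (MAE) of 0.0668 for Km and 0.1860 for Nmax. The ANN model proposed in this study achieved 0.0414 for Km and 0.0818 for Nmax, reducing the MAE by 38% and 56%, respectively, relative to the existing model. The proposed method can be used to improve ANN performance in problems with small amounts of data.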
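The two output variables have a direct reading in the Michaelis-Menten form, N(t) = Nmax · t / (t + Km): the curve saturates at Nmax and reaches half of it at t = Km. The sketch below only illustrates that relationship; the function name and the parameter values are hypothetical and are not taken from the paper.

```python
def cumulative_emission(t_hours, n_max, k_m):
    """Michaelis-Menten type curve: N(t) = Nmax * t / (t + Km)."""
    return n_max * t_hours / (t_hours + k_m)


# Hypothetical parameter values, chosen only for illustration.
n_max = 40.0  # kg/ha
k_m = 12.0    # h

half = cumulative_emission(k_m, n_max, k_m)
late = cumulative_emission(1.0e6, n_max, k_m)
print(half)            # 20.0, i.e. half of n_max at t = k_m
print(round(late, 2))  # approaches n_max for large t
```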

Weighted Fast Adaptation Prior on Meta-Learning

  • Widhianingsih, Tintrim Dwi Ary;Kang, Dae-Ki
    • International journal of advanced smart convergence / Vol. 8, No. 4 / pp.68-74 / 2019
  • As deep learning architectures grow deeper, the need for data becomes very large. In real problems, obtaining huge amounts of data is very costly in some disciplines. Learning from limited data has therefore become a very appealing area in recent years. Meta-learning offers a new perspective on learning a model under this limitation. Meta-SGD, a state-of-the-art model built on a meta-learning framework, was proposed with the key idea of learning a hyperparameter, the learning rate of the fast adaptation stage, in the outer update. However, this learning rate is usually set to be very small. As a consequence, the objective function of SGD yields only a small improvement to the weight parameters. In other words, the prior becomes the key to achieving good adaptation. Because meta-learning approaches aim to learn with a single gradient step in the inner update, this may lead to poor performance, especially if the prior is far from the expected one or, in the opposite case, if it is very effective for adapting the model. For this reason, we propose adding a weight term to decrease, or in some conditions increase, the effect of this prior. Experiments on few-shot learning show that emphasizing or weakening the prior can give better performance than using its original value.
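In Meta-SGD the fast adaptation step is θ' = θ − α ∘ ∇L(θ), where both the prior θ and the per-parameter learning rate α are meta-learned. The abstract does not give the exact form of the proposed weight term, so the sketch below assumes the simplest reading, a scalar multiplier on the prior, θ' = w·θ − α ∘ ∇L(θ). The names `fast_adapt` and `prior_weight`, and the toy quadratic loss, are hypothetical.

```python
def grad_quadratic(theta, target):
    """Gradient of the toy loss 0.5 * sum((theta - target)^2)."""
    return [t - c for t, c in zip(theta, target)]


def fast_adapt(theta, alpha, grad, prior_weight=1.0):
    """One inner-loop step with an elementwise learning rate alpha.

    prior_weight = 1.0 gives the plain Meta-SGD step.
    ASSUMPTION: the weight term scales the prior multiplicatively;
    the paper's actual formulation may differ.
    """
    return [prior_weight * t - a * g for t, a, g in zip(theta, alpha, grad)]


theta = [1.0, -2.0]   # meta-learned prior (toy values)
alpha = [0.1, 0.5]    # meta-learned per-parameter learning rates (toy values)
grad = grad_quadratic(theta, target=[0.0, 0.0])

plain = fast_adapt(theta, alpha, grad)                       # prior used as-is
weakened = fast_adapt(theta, alpha, grad, prior_weight=0.5)  # prior down-weighted
print(plain, weakened)
```

With a small α, the adapted parameters stay close to the prior, so scaling the prior changes the result far more than the gradient step does, which is the effect the abstract describes.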

Prediction of Significant Wave Height in Korea Strait Using Machine Learning

  • Park, Sung Boo;Shin, Seong Yun;Jung, Kwang Hyo;Lee, Byung Gook
    • 한국해양공학회지 / Vol. 35, No. 5 / pp.336-346 / 2021
  • The prediction of wave conditions is crucial in the field of marine and ocean engineering. Hence, this study aims to predict the significant wave height through machine learning (ML), a soft computing method. The adopted metocean data, collected from 2012 to 2020, were obtained from the Korea Institute of Ocean Science and Technology. We adopted the feedforward neural network (FNN) and long short-term memory (LSTM) models to predict significant wave height. Input parameters for the input layer were selected by Pearson correlation coefficients. To obtain the optimized hyperparameters, we conducted a sensitivity study on the window size, number of nodes, number of layers, and activation function. Finally, the significant wave height was predicted using the FNN and LSTM models by varying the three input parameters and three window sizes. FNN (W48) (i.e., FNN with window size 48) and LSTM (W48) (i.e., LSTM with window size 48) gave the best outcomes. The most suitable model for predicting the significant wave height was FNN (W48), owing to its accuracy and calculation time. If more metocean data were accumulated, the accuracy of the ML model would improve, which would be beneficial for predicting added resistance due to waves when conducting a sea trial test.
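Selecting inputs by Pearson correlation amounts to ranking candidate variables by the absolute value of their correlation with the target. A minimal sketch follows; the candidate variable names and all numbers are synthetic placeholders, not the metocean data used in the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Synthetic placeholder data (hypothetical variables and values).
wave_height = [0.5, 0.8, 1.2, 1.9, 2.4]
candidates = {
    "wind_speed": [2.0, 3.5, 5.0, 8.0, 10.0],
    "air_pressure": [1015.0, 1013.0, 1012.0, 1006.0, 1003.0],
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0],
}

# Rank candidates by |r| with the target, strongest first.
ranked = sorted(
    candidates,
    key=lambda name: abs(pearson(candidates[name], wave_height)),
    reverse=True,
)
for name in ranked:
    print(name, round(pearson(candidates[name], wave_height), 3))
```

Using the absolute value keeps strongly negative predictors (here the placeholder pressure series) in the running.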

Application of deep neural networks for high-dimensional large BWR core neutronics

  • Abu Saleem, Rabie;Radaideh, Majdi I.;Kozlowski, Tomasz
    • Nuclear Engineering and Technology / Vol. 52, No. 12 / pp.2709-2716 / 2020
  • Compositions of large nuclear cores (e.g. boiling water reactors) are highly heterogeneous in terms of fuel composition, control rod insertions and flow regimes. For this reason, they usually lack high order of symmetry (e.g. 1/4, 1/8) making it difficult to estimate their neutronic parameters for large spaces of possible loading patterns. A detailed hyperparameter optimization technique (a combination of manual and Gaussian process search) is used to train and optimize deep neural networks for the prediction of three neutronic parameters for the Ringhals-1 BWR unit: power peaking factors (PPF), control rod bank level, and cycle length. Simulation data is generated based on half-symmetry using PARCS core simulator by shuffling a total of 196 assemblies. The results demonstrate a promising performance by the deep networks as acceptable mean absolute error values are found for the global maximum PPF (~0.2) and for the radially and axially averaged PPF (~0.05). The mean difference between targets and predictions for the control rod level is about 5% insertion depth. Lastly, cycle length labels are predicted with 82% accuracy. The results also demonstrate that 10,000 samples are adequate to capture about 80% of the high-dimensional space, with minor improvements found for larger number of samples. The promising findings of this work prove the ability of deep neural networks to resolve high dimensionality issues of large cores in the nuclear area.

Selecting the Optimal Hidden Layer of Extreme Learning Machine Using Multiple Kernel Learning

  • Zhao, Wentao;Li, Pan;Liu, Qiang;Liu, Dan;Liu, Xinwang
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 12 / pp.5765-5781 / 2018
  • Extreme learning machine (ELM) is emerging as a powerful machine learning method in a variety of application scenarios due to its promising advantages of high accuracy, fast learning speed and ease of implementation. However, how to select the optimal hidden layer of ELM is still an open question in the ELM community. Basically, the number of hidden layer nodes is a sensitive hyperparameter that significantly affects the performance of ELM. To address this challenging problem, we propose to adopt multiple kernel learning (MKL) to design a multi-hidden-layer-kernel ELM (MHLK-ELM). Specifically, we first integrate kernel functions with the random feature mapping of ELM to design a hidden-layer-kernel ELM (HLK-ELM), which serves as the base of MHLK-ELM. Then, we utilize the MKL method to propose two versions of MHLK-ELM, called sparse and non-sparse MHLK-ELMs. Both types of MHLK-ELM can effectively find the optimal linear combination of multiple HLK-ELMs for different classification and regression problems. Experimental results on seven data sets, of which three are relevant to classification and four to regression, demonstrate that the proposed MHLK-ELM achieves superior performance compared with conventional ELM and basic HLK-ELM.

A study on applying random forest and gradient boosting algorithm for Chl-a prediction of Daecheong lake

  • 이상민;김일규
    • 상하수도학회지 / Vol. 35, No. 6 / pp.507-516 / 2021
  • In this study, machine learning, which has recently been widely used in prediction algorithms, was applied. The study site was the CD (Chudong) point, a representative point of Daecheong Lake. Chlorophyll-a (Chl-a) concentration was used as the target variable for algae prediction. To predict the Chl-a concentration, a data set of water quality and quantity factors was constructed. We ran the random forest and gradient boosting algorithms in Python. First, the correlation between Chl-a and the water quality and quantity data was analyzed, and ten factors of high importance were extracted from the water quality and quantity data. In terms of the performance indices, gradient boosting gave an RMSE of 2.72 mg/m3, an MSE of 7.40 (mg/m3)², and an R2 of 0.66. In the residual analysis, the result of gradient boosting was excellent, and in the algorithm runs the gradient boosting algorithm was likewise excellent. After machine learning hyperparameter adjustment, the gradient boosting algorithm was also excellent, with an RMSE of 2.44 mg/m3.
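The three performance indices quoted here are closely related: RMSE is the square root of MSE, which is why the reported 2.72 and 7.40 agree (2.72² ≈ 7.40), and R2 compares the squared error with the variance of the observations. A minimal sketch with toy numbers (the `y_true` and `y_pred` values are placeholders, not Chl-a data):

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE and R2 for paired observations and predictions."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "R2": 1.0 - ss_res / ss_tot}


# Toy placeholder values.
y_true = [10.0, 12.0, 15.0, 20.0]
y_pred = [11.0, 11.0, 16.0, 18.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)

# Consistency check on the figures quoted in the abstract.
reported_rmse = 2.72
print(round(reported_rmse ** 2, 2))  # 7.4
```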

Application of Regularized Linear Regression Models Using Public Domain data for Cycle Life Prediction of Commercial Lithium-Ion Batteries

  • 김장군;이종숙
    • 한국수소및신에너지학회논문집 / Vol. 32, No. 6 / pp.592-611 / 2021
  • This study uses a rarely available high-throughput cycling data set of 124 commercial lithium iron phosphate/graphite cells cycled under fast-charging conditions, with cycle lives ranging widely from 150 to 2,300 cycles and including in-cycle temperature and per-cycle IR measurements. We wrote our own Python code, which reproduces the various data plots and machine learning approaches for cycle life prediction from early cycles, along with further details not presented in the article and its supplementary information. In particular, we applied regularized linear regression models (ridge, lasso, and elastic net) using features extracted from capacity fade curves, discharge voltage curves, and other data such as internal resistance and cell can temperature. We found that, because costly and lengthy battery testing limits the quantity and quality of the data, careful hyperparameter tuning may be required and model features need to be extracted based on domain knowledge.
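How the three regularized models differ is easiest to see with a single feature and no intercept, where the minimizer of 0.5·Σ(y − w·x)² + l1·|w| + 0.5·l2·w² has a closed form. The sketch below is that textbook one-dimensional case, not the authors' code; the function names and toy numbers are hypothetical, and l1 and l2 are the hyperparameters that would need tuning.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; return 0.0 inside [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def fit_1d(x, y, l1=0.0, l2=0.0):
    """Minimize 0.5*sum((y - w*x)^2) + l1*|w| + 0.5*l2*w^2 over a scalar w.

    l1 = l2 = 0 -> ordinary least squares
    l1 = 0      -> ridge
    l2 = 0      -> lasso
    both > 0    -> elastic net
    """
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return soft_threshold(sxy, l1) / (sxx + l2)


x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
print(fit_1d(x, y))                    # 2.0  (least squares)
print(fit_1d(x, y, l2=14.0))           # 1.0  (ridge shrinks the slope)
print(fit_1d(x, y, l1=30.0))           # 0.0  (lasso removes the feature)
print(fit_1d(x, y, l1=14.0, l2=14.0))  # 0.5  (elastic net)
```

Ridge only shrinks the coefficient, whereas a large enough l1 sets it exactly to zero, which is why the choice of penalty and its strength matters when data are scarce.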

Mathematical modeling of the impact of Omicron variant on the COVID-19 situation in South Korea

  • Oh, Jooha;Apio, Catherine;Park, Taesung
    • Genomics & Informatics / Vol. 20, No. 2 / pp.22.1-22.9 / 2022
  • The rise of newer coronavirus disease 2019 (COVID-19) variants has brought a challenge to ending the spread of COVID-19. The variants have different fatality, morbidity, and transmission rates and affect vaccine efficacy differently. Therefore, the impact of each new variant on the spread of COVID-19 is of interest to governments and scientists. Here, we proposed mathematical SEIQRDVP and SEIQRDV3P models to predict the impact of the Omicron variant on the spread of COVID-19 in South Korea. SEIQRDVP considers one vaccination level at a time, while SEIQRDV3P considers three vaccination levels (only one dose received, full doses received, and full doses + booster shots received) simultaneously. The Omicron variant's effect was modeled as a weighted sum of the Delta and Omicron variants' transmission rates and tuned using a hyperparameter k. Our models' performances were compared with common models such as SEIR, SEIQR, and SEIQRDVUP using the root mean square error (RMSE). SEIQRDV3P performed better than the SEIQRDVP model. Without consideration of the variant effect, the models did not show a rapid rise in COVID-19 cases and gave high RMSE values. With consideration of the Omicron variant, however, we predicted a continuous rapid rise in COVID-19 cases, possibly until herd immunity develops in the population. Also, the RMSE value for the SEIQRDV3P model decreased by 27.4%. Therefore, modeling the impact of any newly emerging variant is crucial in determining the trajectory of the spread of COVID-19 and the policies to be implemented.
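The role of the hyperparameter k can be illustrated with a deliberately simplified stand-in. The sketch below assumes the weighted sum is a convex combination, β = (1 − k)·β_delta + k·β_omicron, and replaces the compartment models with early exponential growth, I(t) = I0·exp((β − γ)·t); neither assumption comes from the abstract, and every rate and the "observed" series are synthetic. It then tunes k by grid search on RMSE.

```python
import math


def blended_rate(k, beta_delta, beta_omicron):
    """ASSUMED form of the weighted sum: convex combination with weight k."""
    return (1.0 - k) * beta_delta + k * beta_omicron


def early_growth(k, days, i0=100.0, beta_delta=0.3, beta_omicron=0.6, gamma=0.2):
    """Toy stand-in for the epidemic model: I(t) = i0 * exp((beta - gamma) * t).

    All rates are hypothetical illustration values.
    """
    beta = blended_rate(k, beta_delta, beta_omicron)
    return [i0 * math.exp((beta - gamma) * t) for t in range(days)]


def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Synthetic "observations" generated with a known k, so the search can be checked.
observed = early_growth(0.6, days=14)

# Grid search: pick the k with the lowest RMSE against the observations.
grid = [i / 10 for i in range(11)]
best_k = min(grid, key=lambda k: rmse(early_growth(k, 14), observed))
print(best_k)  # 0.6
```

Setting k = 0 reduces the blended rate to the Delta value and k = 1 to the Omicron value, so k expresses how far the variant mix has shifted.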