• Title/Summary/Keyword: Hyperparameter

Search Results: 123

A Prediction of Demand for Korean Baseball League using Artificial Neural Network (인공 신경망 모형을 이용한 한국프로야구 관중 수요 예측)

  • Park, Jinuk;Park, Sanghyun
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2017.04a
    • /
    • pp.920-923
    • /
    • 2017
  • To overcome the difficulties of the ARIMA model commonly used in time series analyses such as demand forecasting, this study predicts daily attendance of the Korean professional baseball (KBO) league using an artificial neural network model. Grid search was applied to select the hyperparameters of a feedforward neural network, the most basic type of artificial neural network, in order to find the optimal model. Daily KBO attendance from March to August 2015 was used as training data, and to verify predictive power, attendance in September 2015 was predicted and compared with the observed values. The model judged optimal by the grid search achieved a mean absolute percentage error (MAPE) of 27.14% on average. In addition, borrowing from ensemble techniques, averaging the predictions of the five models with the lowest error rates yielded an average MAPE of 28.58%. Compared with multiple regression, these correspond to, on average, 14% and 13.6% higher predictive accuracy, respectively.
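As a rough illustration of the grid-search procedure described in this abstract, the sketch below tunes a small feedforward regressor against a MAPE criterion. The features, data, and parameter grid are placeholders, not the study's actual inputs, and the MAPE scorer string assumes a recent scikit-learn.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder stand-ins for daily game features (weekday, opponent, weather, ...)
# and daily attendance; the study used KBO data from March-August 2015.
rng = np.random.default_rng(0)
X = rng.normal(size=(180, 5))
y = 10000 + 5000 * rng.random(180)

pipe = make_pipeline(StandardScaler(), MLPRegressor(max_iter=2000, random_state=0))
param_grid = {
    "mlpregressor__hidden_layer_sizes": [(16,), (32,), (32, 16)],
    "mlpregressor__alpha": [1e-4, 1e-3, 1e-2],
}
search = GridSearchCV(pipe, param_grid, cv=TimeSeriesSplit(n_splits=4),
                      scoring="neg_mean_absolute_percentage_error")  # needs scikit-learn >= 0.24
search.fit(X, y)

y_pred = search.predict(X)
mape = np.mean(np.abs((y - y_pred) / y)) * 100  # the evaluation metric used in the paper
print(search.best_params_, f"MAPE={mape:.2f}%")
```

Averaging the predictions of the lowest-error configurations, as the paper does, would simply reuse `search.cv_results_` to pick the top five candidates and mean their forecasts.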

Parameter Estimation of Groundwater Flow in Hillside Slopes Using Bayesian Approach (사면의 지하수 흐름에서 Bayesian 이론을 이용한 매개변수 추정)

  • 이인모;이주공;김영욱
    • Journal of the Korean Geotechnical Society
    • /
    • v.17 no.2
    • /
    • pp.51-57
    • /
    • 2001
  • An increase in pore water pressure caused by a rising groundwater level can induce slope instability. However, predicting groundwater level fluctuations in a slope is very difficult because of modeling errors, measurement errors, and uncertainty in the model parameters. Back analysis techniques are used to overcome these uncertainties and to obtain the optimal model parameters for evaluating groundwater level fluctuations. In this paper, a numerical model that simultaneously considers groundwater flow in the saturated and unsaturated zones was combined with a parameter estimation technique to predict groundwater level fluctuations in a slope. The saturated permeability coefficient ($K_s$), the saturation suction ($\psi_e$), and the empirical constant (b) used in the unsaturated permeability function were selected as the main parameters for the back analysis. A comparative study was then carried out on three back analysis methods: Maximum Likelihood (ML), Maximum A Posteriori (MAP), and the Extended Bayesian Method (EBM). Among the three, EBM, which introduces a hyperparameter $\beta$, was found to best reconcile the field measurements with the prior information and to outperform both ML and MAP.
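The sketch below is a minimal, generic illustration of the ML/MAP-style back analysis compared in this paper: a least-squares data misfit is combined with a prior-misfit term whose weight $\beta$ plays the role of the hyperparameter introduced in the Extended Bayesian Method. The toy forward model, parameter values, and noise scales are assumptions, not the paper's saturated/unsaturated flow solver.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical forward model g(theta): predicted groundwater heads for parameters
# theta = (log Ks, psi_e, b). A real application would call the flow solver here.
def g(theta):
    return np.array([theta[0] + 0.5 * theta[1], theta[0] - 0.2 * theta[2]])

obs = np.array([1.2, 0.8])               # field measurements (made up)
theta_prior = np.array([1.0, 0.5, 2.0])  # prior information on the parameters
sigma_obs, sigma_prior = 0.1, 0.5

def objective(theta, beta):
    # MAP-style objective: data misfit plus a beta-weighted prior misfit.
    # In the Extended Bayesian Method, beta itself is chosen from the data;
    # here it is fixed for illustration only.
    misfit = np.sum((obs - g(theta)) ** 2) / sigma_obs ** 2
    prior = beta * np.sum((theta - theta_prior) ** 2) / sigma_prior ** 2
    return misfit + prior

theta_map = minimize(objective, theta_prior, args=(1.0,)).x
theta_ml = minimize(objective, theta_prior, args=(0.0,)).x  # beta = 0 reduces to ML
print(theta_ml, theta_map)
```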


Online Adaptation of Continuous Density Hidden Markov Models Based on Speaker Space Model Evolution (화자공간모델 진화에 근거한 연속밀도 은닉 마코프모델의 온라인 적응)

  • Kim Dong Kook;Kim Young Joon;Kim Hyun Woo;Kim Nam Soo
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.69-72
    • /
    • 2002
  • This paper proposes a new technique for the online adaptation of continuous density hidden Markov models (CDHMMs) based on speaker space model evolution. The speaker space model, which represents a priori knowledge of the training speakers, is effectively described by a latent variable model such as factor analysis (FA) or probabilistic principal component analysis (PPCA). Because the latent variable model represents not only the speaker space model but also a joint prior distribution over the CDHMM parameters, it can be applied directly to maximum a posteriori (MAP) adaptation. To adapt the hyperparameters of the speaker space model and the CDHMM parameters jointly and sequentially, we propose an online adaptation scheme based on quasi-Bayes (QB) estimation. Speaker adaptation experiments on connected-digit recognition show that the proposed technique performs well with small amounts of adaptation data and that its performance improves steadily as the amount of data increases.
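A loose sketch of the underlying idea, under simplifying assumptions: training speakers' mean supervectors define a latent-variable speaker space (factor analysis here, standing in for FA/PPCA), and a new speaker's parameters are shrunk toward their projection onto that space, with the shrinkage weight growing as adaptation data accumulates. This is not the paper's quasi-Bayes recursion; the dimensions, the weighting rule, and the helper `adapt` are placeholders.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical "speaker space": each row is a supervector of Gaussian means
# pooled over one training speaker's CDHMM.
rng = np.random.default_rng(0)
train_supervectors = rng.normal(size=(50, 120))  # 50 speakers x 120-dim supervector

# Latent variable model (FA, as a stand-in for FA/PPCA) capturing prior knowledge
# of speaker variability.
fa = FactorAnalysis(n_components=10, random_state=0).fit(train_supervectors)

def adapt(sample_supervector, n, tau=20.0):
    # Reconstruct the speaker-space (prior) estimate, then interpolate it with the
    # new speaker's sample estimate; more adaptation data -> trust the sample more.
    prior_mean = fa.mean_ + fa.transform(sample_supervector[None, :]) @ fa.components_
    w = n / (n + tau)
    return w * sample_supervector + (1 - w) * prior_mean.ravel()

new_speaker_obs = rng.normal(size=120)
adapted = adapt(new_speaker_obs, n=5)
print(adapted.shape)
```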


Use of Training Data to Estimate the Smoothing Parameter for Bayesian Image Reconstruction

  • Lee, Soo-Jin
    • Journal of the Korean Geophysical Society
    • /
    • v.4 no.3
    • /
    • pp.175-182
    • /
    • 2001
  • We consider the problem of determining smoothing parameters of Gibbs priors for Bayesian methods used in the medical imaging application of emission tomographic reconstruction. We address a simple smoothing prior (membrane) whose global hyperparameter (the smoothing parameter) controls the bias/variance tradeoff of the solution. We base our maximum-likelihood (ML) estimates of hyperparameters on observed training data, and argue the motivation for this approach. Good results are obtained with a simple ML estimate of the smoothing parameter for the membrane prior.
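Reading the membrane prior as a Gaussian Markov random field, $p(x\mid\beta)\propto\exp(-\beta U(x))$ with $U(x)$ the sum of squared neighbor differences, the normalizer scales as $\beta^{-(N-1)/2}$ (up to the improper constant-image direction), which gives the closed-form ML estimate $\hat\beta=(N-1)/(2U(x))$ from a training image. The sketch below illustrates that estimate on a synthetic stand-in image; it is one reading of the approach, not necessarily the paper's exact procedure.

```python
import numpy as np

def membrane_energy(img):
    # U(x): sum of squared differences between horizontally and vertically
    # adjacent pixels (the "membrane" roughness penalty).
    dx = np.diff(img, axis=0)
    dy = np.diff(img, axis=1)
    return np.sum(dx ** 2) + np.sum(dy ** 2)

# Synthetic stand-in for a training image (a real application would use
# a representative phantom or reconstructed training data).
rng = np.random.default_rng(0)
train_img = rng.normal(size=(64, 64)).cumsum(axis=0).cumsum(axis=1)

N = train_img.size
beta_hat = (N - 1) / (2.0 * membrane_energy(train_img))  # ML smoothing parameter
print(beta_hat)
```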


Semiparametric kernel logistic regression with longitudinal data

  • Shim, Joo-Yong;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.2
    • /
    • pp.385-392
    • /
    • 2012
  • Logistic regression is a well-known binary classification method in the field of statistical learning. Mixed-effect regression models are widely used for the analysis of correlated data such as those found in longitudinal studies. We consider kernel extensions with semiparametric fixed effects and parametric random effects for logistic regression. The estimation is performed through the penalized likelihood method based on the kernel trick, and our focus is on efficient computation and effective hyperparameter selection. For the selection of optimal hyperparameters, cross-validation techniques are employed. Numerical results are then presented to indicate the performance of the proposed procedure.
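The abstract's two concerns, kernel-based estimation and hyperparameter selection by cross-validation, can be illustrated with the hedged sketch below. It uses an approximate RBF feature map in place of an exact kernel solver and ignores the paper's longitudinal random effects; the data and grids are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Kernel logistic regression via an explicit (approximate) RBF feature map, with
# the kernel width gamma and the penalty strength C selected by cross-validation.
pipe = Pipeline([
    ("kernel", Nystroem(kernel="rbf", n_components=100, random_state=0)),
    ("logit", LogisticRegression(max_iter=1000)),
])
param_grid = {"kernel__gamma": [0.01, 0.1, 1.0], "logit__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)
print(search.best_params_, search.best_score_)
```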

Use of Training Data to Estimate the Smoothing Parameter for Bayesian Image Reconstruction

  • Lee, Soo-Jin
    • The Journal of Engineering Research
    • /
    • v.4 no.1
    • /
    • pp.47-54
    • /
    • 2002
  • We consider the problem of determining smoothing parameters of Gibbs priors for Bayesian methods used in the medical imaging application of emission tomographic reconstruction. We address a simple smoothing prior (membrane) whose global hyperparameter (the smoothing parameter) controls the bias/variance tradeoff of the solution. We base our maximum-likelihood (ML) estimates of hyperparameters on observed training data, and argue the motivation for this approach. Good results are obtained with a simple ML estimate of the smoothing parameter for the membrane prior.


Recent Research & Development Trends in Automated Machine Learning (자동 기계학습(AutoML) 기술 동향)

  • Moon, Y.H.;Shin, I.H.;Lee, Y.J.;Min, O.G.
    • Electronics and Telecommunications Trends
    • /
    • v.34 no.4
    • /
    • pp.32-42
    • /
    • 2019
  • The performance of machine learning algorithms significantly depends on how a configuration of hyperparameters is identified and how a neural network architecture is designed. However, this requires expert knowledge of the relevant task domains and a prohibitive computation time. To optimize these two processes with minimal effort, many studies have investigated automated machine learning in recent years. This paper reviews the conventional random, grid, and Bayesian methods for hyperparameter optimization (HPO) and addresses recent approaches that speed up the identification of the best set of hyperparameters. We further investigate existing neural architecture search (NAS) techniques based on evolutionary algorithms, reinforcement learning, and gradient derivatives, and analyze their theoretical characteristics and performance results. Moreover, future research directions and challenges in HPO and NAS are described.
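As a small, generic illustration of two of the conventional HPO baselines this survey reviews, grid search and random search (Bayesian optimization would need an extra library such as Optuna or scikit-optimize), the sketch below tunes an SVM on synthetic data. The model, search spaces, and budgets are arbitrary choices, not drawn from the survey.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)

# Grid search: exhaustive evaluation over a fixed lattice of hyperparameters.
grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [1e-3, 1e-2, 1e-1]}, cv=5)
grid.fit(X, y)

# Random search: samples configurations from distributions, often reaching a
# comparable optimum with far fewer evaluations in high-dimensional spaces.
rand = RandomizedSearchCV(
    SVC(),
    {"C": loguniform(1e-2, 1e2), "gamma": loguniform(1e-4, 1e0)},
    n_iter=20, cv=5, random_state=0,
)
rand.fit(X, y)

print(grid.best_params_, rand.best_params_)
```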

Optimizing Artificial Neural Network-Based Models to Predict Rice Blast Epidemics in Korea

  • Lee, Kyung-Tae;Han, Juhyeong;Kim, Kwang-Hyung
    • The Plant Pathology Journal
    • /
    • v.38 no.4
    • /
    • pp.395-402
    • /
    • 2022
  • To predict rice blast, many machine learning methods have been proposed. As the quality and quantity of input data are essential for machine learning techniques, this study develops three artificial neural network (ANN)-based rice blast prediction models by combining two ANN models, the feed-forward neural network (FFNN) and long short-term memory (LSTM), with diverse input datasets, and compares their performance. The Blast_Weather_LSTM_FFNN model had the highest recall score (66.3%) for rice blast prediction. This model requires two types of input data: blast occurrence data for the last 3 years and weather data (daily maximum temperature, relative humidity, and precipitation) between January and July of the prediction year. This study showed that the performance of an ANN-based disease prediction model was improved by applying suitable machine learning techniques together with hyperparameter tuning and optimization of the input data. Moreover, we highlight the importance of the systematic collection of long-term disease data.
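A rough Keras sketch of a hybrid of the two architectures named in the abstract: an LSTM branch for the January-July daily weather sequence and a dense (FFNN) branch for the recent blast-occurrence history, merged for a binary epidemic prediction evaluated by recall. The layer sizes, input shapes, and variable names are assumptions, not the authors' published configuration.

```python
from tensorflow.keras import Model, layers, metrics

# Assumed inputs: 3 seasons of blast-occurrence features and ~212 days
# (January-July) of 3 weather variables (max temperature, humidity, precipitation).
blast_in = layers.Input(shape=(3,), name="blast_history")
weather_in = layers.Input(shape=(212, 3), name="weather_jan_jul")

h_seq = layers.LSTM(32)(weather_in)                     # sequence (LSTM) branch
h_tab = layers.Dense(16, activation="relu")(blast_in)   # feedforward (FFNN) branch
h = layers.concatenate([h_seq, h_tab])
out = layers.Dense(1, activation="sigmoid")(h)          # probability of a blast epidemic

model = Model(inputs=[blast_in, weather_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=[metrics.Recall()])
model.summary()
```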

LSTM Model-based Prediction of the Variations in Load Power Data from Industrial Manufacturing Machines

  • Rijayanti, Rita;Jin, Kyohong;Hwang, Mintae
    • Journal of information and communication convergence engineering
    • /
    • v.20 no.4
    • /
    • pp.295-302
    • /
    • 2022
  • This paper describes the development of a smart power device designed to collect load power data from industrial manufacturing machines, predict future variations in the load power data, and detect abnormal data in advance by applying a machine learning-based prediction algorithm. The proposed load power data prediction model is implemented using a Long Short-Term Memory (LSTM) algorithm with high accuracy and relatively low complexity. Flask and a REST API are used to provide the prediction results to users through a graphical interface. In addition, we present the results of experiments conducted to evaluate the performance of the proposed approach, which show that our model exhibited the highest accuracy compared with Multilayer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM) models. Moreover, we expect that our method's accuracy could be further improved by optimizing the hyperparameter values and training the model for a longer period with a larger amount of data.
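A minimal, generic LSTM sketch for one-step-ahead prediction on a univariate load-power-like series. The window length, layer size, and synthetic data are assumptions, and the paper's Flask/REST serving layer is omitted.

```python
import numpy as np
from tensorflow.keras import Sequential, layers

# Synthetic stand-in for a load-power series; windows of the last 24 readings
# are used to predict the next reading.
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 60, 2000)) + 0.1 * rng.normal(size=2000)

window = 24
X = np.stack([series[i:i + window] for i in range(len(series) - window)])[..., None]
y = series[window:]

model = Sequential([
    layers.Input(shape=(window, 1)),
    layers.LSTM(32),   # units and window length are typical hyperparameter targets
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print(model.predict(X[-1:], verbose=0))
```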

Performance Evaluation of a Feature-Importance-based Feature Selection Method for Time Series Prediction

  • Ahn, Hyun
    • Journal of information and communication convergence engineering
    • /
    • v.21 no.1
    • /
    • pp.82-89
    • /
    • 2023
  • Various machine-learning models may yield high predictive power for massive time series in time series prediction. However, these models are prone to instability in terms of computational cost because of the high dimensionality of the feature space and nonoptimized hyperparameter settings. Considering the potential risk that model training with a high-dimensional feature set can be time-consuming, we evaluate a feature-importance-based feature selection method to derive a tradeoff between predictive power and computational cost for time series prediction. We used two machine learning techniques for performance evaluation to generate prediction models from a retail sales dataset. First, we ranked the features using impurity-based and Local Interpretable Model-agnostic Explanations (LIME)-based feature importance measures in the prediction models. Then, the recursive feature elimination method was applied to eliminate unimportant features sequentially. Consequently, we obtained a subset of features that could lead to reduced model training time while preserving acceptable model performance.
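A hedged sketch of the impurity-importance half of the procedure described above (the LIME-based ranking would need the separate lime package): a random forest ranks the features, recursive feature elimination prunes the weakest ones, and the score of the reduced feature set can then be compared against the full set. The dataset and feature counts are placeholders, not the retail sales data used in the paper.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=400, n_features=50, n_informative=8, random_state=0)

# Impurity-based importances rank the features; RFE then drops the weakest ones
# in steps, trading a little accuracy for a much smaller (cheaper) feature set.
rf = RandomForestRegressor(n_estimators=200, random_state=0)
selector = RFE(rf, n_features_to_select=10, step=5).fit(X, y)

X_small = X[:, selector.support_]
full_score = cross_val_score(rf, X, y, cv=3).mean()
small_score = cross_val_score(rf, X_small, y, cv=3).mean()
print(f"all 50 features: {full_score:.3f}, selected 10: {small_score:.3f}")
```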