• Title/Summary/Keyword: optimizing input data

Development of Power Demand Forecasting Algorithm Using GMDH (GMDH를 이용한 전력 수요 예측 알고리즘 개발)

  • Lee, Dong-Chul;Hong, Yeon-Chan
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.3 / pp.360-365 / 2003
  • In this paper, the GMDH (Group Method of Data Handling) algorithm, which has been shown to be efficient and accurate in the practical use of data, is applied to electric power demand forecasting. As a result, it became much easier to select input data and to make accurate predictions from a large amount of data. We considered both economic factors (GDP, exports, imports, number of employees, economically active population, and oil consumption) and a climate factor (average temperature) when forecasting. We set the target forecast period from the first quarter of 1999 to the first quarter of 2001 and, to improve forecast precision, propose a more accurate method of forecasting electric power demand using a three-step computer simulation process: first selecting the optimum input period, second analyzing the time relation between the input data and the forecast value, and third optimizing the input data. The proposed method achieves a mean error rate of 0.96 percent over the target forecast period.
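
The abstract reports a mean error rate but does not define it. As a minimal sketch, assuming it means the mean absolute percentage error over the forecast quarters, it could be computed as follows. The function name and the sample values are hypothetical and are not taken from the paper.

```python
def mean_error_rate(actual, forecast):
    """Mean absolute percentage error, in percent (assumed definition)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical quarterly demand values, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [99.0, 202.0, 396.0]
print(mean_error_rate(actual, forecast))  # each quarter is off by 1 percent
```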

Optimizing Artificial Neural Network-Based Models to Predict Rice Blast Epidemics in Korea

  • Lee, Kyung-Tae;Han, Juhyeong;Kim, Kwang-Hyung
    • The Plant Pathology Journal / v.38 no.4 / pp.395-402 / 2022
  • To predict rice blast, many machine learning methods have been proposed. As the quality and quantity of input data are essential for machine learning techniques, this study develops three artificial neural network (ANN)-based rice blast prediction models by combining two ANN models, the feed-forward neural network (FFNN) and long short-term memory, with diverse input datasets, and compares their performance. The Blast_Weather_FFNN model had the highest recall score (66.3%) for rice blast prediction. This model requires two types of input data: blast occurrence data for the last 3 years and weather data (daily maximum temperature, relative humidity, and precipitation) between January and July of the prediction year. This study showed that the performance of an ANN-based disease prediction model was improved by applying suitable machine learning techniques together with the optimization of hyperparameter tuning involving input data. Moreover, we highlight the importance of the systematic collection of long-term disease data.
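
The recall score quoted above is the share of actual blast occurrences that the model flags. A minimal sketch of that metric follows, assuming binary labels where 1 marks an occurrence. The function name and labels are illustrative and do not come from the study.

```python
def recall(y_true, y_pred):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("recall is undefined without positive samples")
    return tp / (tp + fn)


# Hypothetical labels: 1 = blast occurred, 0 = no blast.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]
print(recall(y_true, y_pred))  # two of the three occurrences were caught
```

Recall ignores false alarms, so a model tuned only for recall may over-predict outbreaks.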

Optimizing Input Parameters of Paralichthys olivaceus Disease Classification based on SHAP Analysis (SHAP 분석 기반의 넙치 질병 분류 입력 파라미터 최적화)

  • Kyung-Won Cho;Ran Baik
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1331-1336 / 2023
  • In text-based fish disease classification using machine learning, the model has too many input parameters, yet they cannot be reduced arbitrarily without degrading performance. To solve this problem, this paper proposes a method of optimizing input parameters specialized for Paralichthys olivaceus disease classification using SHAP analysis. The proposed method preprocesses disease information extracted from the halibut disease questionnaire by applying SHAP analysis and evaluates a machine learning model using AutoML. Through this, the performance of the input parameters of AutoML is evaluated and the optimal input parameter combination is derived. The proposed method is expected to maintain the existing performance while reducing the number of input parameters required, which will contribute to the efficiency and practicality of text-based Paralichthys olivaceus disease classification.

New Fuzzy Inference System Using a Kernel-based Method

  • Kim, Jong-Cheol;Won, Sang-Chul;Suga, Yasuo
    • 제어로봇시스템학회:학술대회논문집 / 2003.10a / pp.2393-2398 / 2003
  • In this paper, we propose a new fuzzy inference system for modeling nonlinear systems given input and output data. In the suggested fuzzy inference system, the number of fuzzy rules and the parameter values of the membership functions are decided automatically by a kernel-based method. The kernel-based method performs a linear transformation and a kernel mapping in turn. The linear transformation projects the input space into a linearly transformed input space. The kernel mapping projects the linearly transformed input space into a high-dimensional feature space. The structure of the proposed fuzzy inference system is equivalent to a Takagi-Sugeno fuzzy model whose input variables are weighted linear combinations of the original input variables. In addition, the number of fuzzy rules can be reduced while optimizing a given criterion by adjusting the linear transformation matrix and the parameter values of the kernel functions using the gradient descent method. Once a structure is selected, the coefficients in the consequent part are determined by the least squares method. Simulation results illustrate the effectiveness of the proposed technique.

Optimizing Hydrological Quantitative Precipitation Forecast (HQPF) based on Machine Learning for Rainfall Impact Forecasting (호우 영향예보를 위한 머신러닝 기반의 수문학적 정량강우예측(HQPF) 최적화 방안)

  • Lee, Han-Su;Jee, Yongkeun;Lee, Young-Mi;Kim, Byung-Sik
    • Journal of Environmental Science International / v.30 no.12 / pp.1053-1065 / 2021
  • In this study, the prediction technology of the Hydrological Quantitative Precipitation Forecast (HQPF) was improved by optimizing the weather predictors used as input data for machine learning. Results were compared using bias and Root Mean Square Error (RMSE), which are indicators for verifying predictive accuracy, based on the heavy rain case of August 21, 2021. Comparing the rainfall simulated using the improved HQPF with the observed accumulated rainfall revealed that all HQPFs (conventional HQPF and improved HQPF 1 and HQPF 2) showed a decrease in rainfall as the lead time increased over the entire grid region. Hence, the difference from the observed rainfall increased. In the accumulated rainfall evaluation after reducing the input factors, improved HQPF 1 and 2 predicted a larger accumulated rainfall than the existing HQPF. Furthermore, HQPF 2 used the fewest input factors and simulated more accumulated rainfall than conventional HQPF and HQPF 1. By improving the performance of conventional machine learning while using fewer variables, the preprocessing period and model execution time can be reduced, thereby contributing to model optimization. As an additional advanced method beyond HQPF 1 and 2, a simulation analysis was conducted of the Local ENsemble prediction System (LENS) ensemble members and low pressure, one of the observed meteorological factors. Based on the results of this study, if well-performing ensemble members are selected according to the heavy rain characteristics of Korea, or if different weights are applied to each ensemble member, the prediction accuracy is expected to increase.
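
The two verification indicators named in this abstract can be sketched as below. This assumes bias is the mean of forecast minus observation, so a negative value means under-forecasting. The abstract does not state its sign convention, and the rainfall values are hypothetical.

```python
import math


def bias(forecast, observed):
    """Mean error, forecast minus observed (assumed sign convention)."""
    return sum(f - o for f, o in zip(forecast, observed)) / len(observed)


def rmse(forecast, observed):
    """Root mean square error between forecast and observed values."""
    squared = sum((f - o) ** 2 for f, o in zip(forecast, observed))
    return math.sqrt(squared / len(observed))


# Hypothetical accumulated rainfall (mm) at three grid points.
forecast = [10.0, 20.0, 30.0]
observed = [12.0, 24.0, 30.0]
print(bias(forecast, observed))  # negative: rainfall is under-forecast
print(rmse(forecast, observed))
```

Bias shows the direction of a systematic error, while RMSE penalizes large misses regardless of sign, which is why the two are usually reported together.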

Monophthong Recognition Optimizing Muscle Mixing Based on Facial Surface EMG Signals (안면근육 표면근전도 신호기반 근육 조합 최적화를 통한 단모음인식)

  • Lee, Byeong-Hyeon;Ryu, Jae-Hwan;Lee, Mi-Ran;Kim, Deok-Hwan
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.3 / pp.143-150 / 2016
  • In this paper, we propose a Korean monophthong recognition method that optimizes muscle mixing based on facial surface EMG signals. We observed that EMG signal patterns and muscle activity vary with Korean monophthong pronunciation. As feature extraction algorithms, we use RMS, VAR, MMAV1, and MMAV2, which showed high recognition accuracy in a previous study, together with cepstral coefficients. We classify Korean monophthongs using QDA (Quadratic Discriminant Analysis) and HMM (Hidden Markov Model). Muscle mixing is optimized using the input data in the training phase, and the optimized result is applied in the recognition phase, in which new data are input and the Korean monophthongs are recognized. Experimental results show an average recognition accuracy of 85.7% with QDA and 75.1% with HMM.
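
Two of the time-domain features listed above, RMS and VAR, can be sketched over sliding windows of one EMG channel as follows. The window size, step, and the use of the sample variance about the window mean are assumptions. The abstract gives no parameters, and some EMG literature defines VAR without subtracting the mean.

```python
import math


def rms(window):
    """Root mean square amplitude of one window."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def var(window):
    """Sample variance about the window mean (assumed definition)."""
    mean = sum(window) / len(window)
    return sum((x - mean) ** 2 for x in window) / (len(window) - 1)


def window_features(signal, size, step):
    """Return one (RMS, VAR) pair per sliding window of the signal."""
    starts = range(0, len(signal) - size + 1, step)
    return [(rms(signal[i:i + size]), var(signal[i:i + size])) for i in starts]


# Hypothetical EMG samples from a single channel.
signal = [3.0, -3.0, 3.0, -3.0]
print(window_features(signal, size=4, step=4))
```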

Traffic-based reinforcement learning with neural network algorithm in fog computing environment

  • Jung, Tae-Won;Lee, Jong-Yong;Jung, Kye-Dong
    • International Journal of Internet, Broadcasting and Communication / v.12 no.1 / pp.144-150 / 2020
  • Reinforcement learning is a technology that can present successful and creative solutions in many areas. In this work, reinforcement learning was used to deploy containers from cloud servers to fog servers, learning to maximize the reward obtained from reduced traffic. The aim of leveraging reinforcement learning is to predict traffic in the network and to optimize a traffic-based fog computing network environment for the cloud, fog, and clients. The reinforcement learning system collects network traffic data from the fog server and IoT devices. The reinforcement learning neural network, which uses the collected traffic data as input values, can be built from Long Short-Term Memory (LSTM) neural networks in network environments that support fog computing, in order to learn time series data and predict optimized traffic. This paper describes the input and output values of the traffic-based reinforcement learning LSTM neural network, the composition of the nodes, the activation function and error function of the hidden layer, the overfitting method, and the optimization algorithm.

Seismic response of soil-structure interaction using the support vector regression

  • Mirhosseini, Ramin Tabatabaei
    • Structural Engineering and Mechanics / v.63 no.1 / pp.115-124 / 2017
  • In this paper, a different technique to predict the effects of soil-structure interaction (SSI) on the seismic response of building systems is investigated. The technique uses a machine learning algorithm called Support Vector Regression (SVR) with technical and analytical results as input features. Normally, the effects of SSI on the seismic response of existing building systems can be identified from different types of large data sets. Therefore, predicting and estimating the seismic response of a building is a difficult task. By giving the right experimental and/or numerical data to a machine learning regression such as SVR, it is possible to approximate a real-valued function of the seismic response, make accurate investing choices regarding the design of the building system, and reduce the risk involved. The seismic response of both a single-degree-of-freedom system and a six-storey RC frame, which are representative of a broad range of existing structures, is estimated using the proposed SVR model, while allowing for the flexibility of the soil-foundation system and SSI effects. The results show that the performance of the technique can be predicted by reducing the number of real data input features. Further, performance enhancement was achieved by optimizing the RBF kernel and SVR parameters through grid search.

Fine-tuning BERT-based NLP Models for Sentiment Analysis of Korean Reviews: Optimizing the sequence length (BERT 기반 자연어처리 모델의 미세 조정을 통한 한국어 리뷰 감성 분석: 입력 시퀀스 길이 최적화)

  • Sunga Hwang;Seyeon Park;Beakcheol Jang
    • Journal of Internet Computing and Services / v.25 no.4 / pp.47-56 / 2024
  • This paper proposes a method for fine-tuning BERT-based natural language processing models to perform sentiment analysis on Korean review data. By varying the input sequence length during this process and comparing the performance, we aim to find the input sequence length that gives the best performance. For this purpose, text review data were collected by web scraping from the clothing shopping platform M. During the data preprocessing stage, positive and negative satisfaction scores were recalibrated to improve the accuracy of the analysis. Specifically, the GPT-4 API was used to reset the labels to reflect the actual sentiment of the review texts, and data imbalance was addressed by adjusting the data to a 6:4 ratio. The reviews on the clothing shopping platform averaged about 12 tokens in length, and to find the model best suited to this, five BERT-based pre-trained models were used in the modeling stage, focusing on input sequence length and memory usage for performance comparison. The experimental results indicated that an input sequence length of 64 generally gave the most appropriate performance and memory usage. In particular, the KcELECTRA model showed optimal performance and memory usage at an input sequence length of 64, achieving higher than 92% accuracy and reliability in sentiment analysis of Korean review data. Furthermore, by utilizing BERTopic, we provide a Korean review sentiment analysis process that classifies new incoming review data by category and extracts sentiment scores for each category using the final constructed model.
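
Fixing the input sequence length means every tokenized review is truncated or padded to the same number of token IDs. A minimal sketch of that step follows, assuming a padding ID of 0 and ignoring the special tokens a real BERT tokenizer adds. The function name is hypothetical and this is not the authors' pipeline.

```python
def fit_to_length(token_ids, max_len, pad_id=0):
    """Truncate or right-pad token IDs to max_len and build an attention mask."""
    clipped = token_ids[:max_len]
    padding = max_len - len(clipped)
    input_ids = clipped + [pad_id] * padding
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [1] * len(clipped) + [0] * padding
    return input_ids, attention_mask


# Hypothetical token IDs for a short and a long review.
print(fit_to_length([5, 6, 7], max_len=5))
print(fit_to_length([1, 2, 3, 4], max_len=2))
```

Memory use grows with the chosen length, so with reviews averaging about 12 tokens, a long fixed length mostly stores padding.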

Development of Flash Boiling Spray Prediction Model of Multi-hole GDI Injector Using Machine Learning (머신러닝을 이용한 다공형 GDI 인젝터의 플래시 보일링 분무 예측 모델 개발)

  • Chang, Mengzhao;Shin, Dalho;Pham, Quangkhai;Park, Suhan
    • Journal of ILASS-Korea / v.27 no.2 / pp.57-65 / 2022
  • The purpose of this study is to use machine learning to build a model capable of predicting flash boiling spray characteristics. In this study, the flash boiling spray was visualized using Shadowgraph visualization technology, and the spray images were then processed with MATLAB to obtain quantitative data on the spray characteristics. The experimental conditions were used as input and the spray characteristics as output to train the machine learning model. For the machine learning model, the XGB (extreme gradient boosting) algorithm was used. Finally, the performance of the machine learning model was evaluated using R2 and RMSE (root mean square error). In order to have enough data to train the machine learning model, this study used 12 injectors with different design parameters and set various fuel temperatures and ambient pressures, resulting in about 12,000 data points. By comparing the performance of the model with different amounts of training data, it was found that the number of training data points must reach at least 7,000 before the model can show optimal performance. The model showed different prediction performance for different spray characteristics. Compared with the upstream spray angle and the downstream spray angle, the model had the best prediction performance for the spray tip penetration. In addition, the prediction performance of the model was relatively poor in the initial and final stages of injection. The model performance is expected to be further enhanced by optimizing the hyper-parameters input into the model.
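
The R2 score used for evaluation here is the coefficient of determination. As a minimal sketch, assuming the standard definition of one minus the ratio of residual to total sum of squares, it can be computed as follows. The sample values are hypothetical and are not spray measurements from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values, for illustration only.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 4.0]
print(r_squared(y_true, y_pred))
```

A value of 1 means the predictions match the measurements exactly, and a value of 0 means the model does no better than always predicting the mean.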