• Title/Summary/Keyword: stacking technique


Two-Stage Neural Network Optimization for Robust Solar Photovoltaic Forecasting (강건한 태양광 발전량 예측을 위한 2단계 신경망 최적화)

  • Jinyeong Oh;Dayeong So;Jihoon Moon
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2024.01a
    • /
    • pp.31-34
    • /
    • 2024
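  • Solar energy is attracting considerable attention as a key means of achieving carbon neutrality. Because solar photovoltaic (PV) power generation can vary greatly with environmental factors, accurate generation forecasting is fundamentally important for power-network stability and efficient energy management. Neural networks, a representative artificial intelligence technique, can effectively learn unstable environmental variables and their complex interactions, and have shown excellent performance in PV generation forecasting. However, optimizing a neural network's architecture and hyperparameters is complex and time-consuming, which limits practical industrial application in the energy sector. This paper proposes a PV generation forecasting method based on two-stage neural network optimization. First, the PV generation dataset is split into a training set and a test set. On the training set, several neural network models with different numbers of hidden layers are constructed, and Optuna is applied to each model to select optimal hyperparameter values. Next, the neural network model optimized for each hidden-layer count is used to output generation estimates for the training set and predictions for the test set, each obtained with 5-fold cross-validation. Finally, adopting a stacking ensemble approach, a random forest, which performs well even with default hyperparameter values, is trained on the estimates and takes the test-set predictions as input to produce the final PV generation forecast. In experiments on the Incheon region, the proposed method was not only simple to model but also outperformed several neural network models, and we expect it to contribute to the domestic energy industry.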
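The out-of-fold stacking flow described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the base learners `fit_line` and `fit_mean` are hypothetical stand-ins for the Optuna-tuned neural networks, the meta-learner `fit_blend` is a least-squares blend standing in for the random forest, the test-set predictions come from base learners refit on the full training set, and the toy data are made up.

```python
from statistics import mean

def kfold_indices(n, k=5):
    """Split range(n) into k contiguous folds; yield (train_idx, valid_idx)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        valid = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, valid
        start += size

def fit_line(xs, ys):
    """Hypothetical base learner: 1-D least-squares line (stand-in for a tuned network)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: slope * (x - mx) + my

def fit_mean(xs, ys):
    """Hypothetical base learner: always predicts the mean of its training targets."""
    my = mean(ys)
    return lambda x: my

def out_of_fold(fit, xs, ys, k=5):
    """Predict every training sample with a model that did not see it during fitting."""
    oof = [0.0] * len(xs)
    for train, valid in kfold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in valid:
            oof[i] = model(xs[i])
    return oof

def fit_blend(p1, p2, ys):
    """Meta-learner stand-in: least-squares weight w for w*p1 + (1 - w)*p2."""
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    w = sum((a - b) * (y - b) for a, b, y in zip(p1, p2, ys)) / den if den else 0.5
    return lambda a, b: w * a + (1 - w) * b

# Toy data (made up): the target is exactly linear in the single input.
train_x = [float(i) for i in range(10)]
train_y = [2.0 * x + 1.0 for x in train_x]
test_x = [10.0, 11.0]

# Stage 1: out-of-fold estimates on the training set become meta-features.
oof_line = out_of_fold(fit_line, train_x, train_y)
oof_mean = out_of_fold(fit_mean, train_x, train_y)

# Stage 2: the meta-learner is trained on the estimates, then applied to test predictions.
meta = fit_blend(oof_line, oof_mean, train_y)
line_full = fit_line(train_x, train_y)
mean_full = fit_mean(train_x, train_y)
final = [meta(line_full(x), mean_full(x)) for x in test_x]
```

Training the meta-learner on out-of-fold estimates rather than in-sample fits keeps it from simply rewarding whichever base learner overfits the training set most.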


Monitoring Ground-level SO2 Concentrations Based on a Stacking Ensemble Approach Using Satellite Data and Numerical Models (위성 자료와 수치모델 자료를 활용한 스태킹 앙상블 기반 SO2 지상농도 추정)

  • Choi, Hyunyoung;Kang, Yoojin;Im, Jungho;Shin, Minso;Park, Seohui;Kim, Sang-Min
    • Korean Journal of Remote Sensing
    • /
    • v.36 no.5_3
    • /
    • pp.1053-1066
    • /
    • 2020
  • Sulfur dioxide (SO2) is primarily released through industrial, residential, and transportation activities, and creates secondary air pollutants through chemical reactions in the atmosphere. Long-term exposure to SO2 can harm the human body by causing respiratory or cardiovascular disease, which makes effective and continuous monitoring of SO2 crucial. In South Korea, SO2 is monitored at ground stations, but this does not provide spatially continuous information on SO2 concentrations. Thus, this research estimated spatially continuous ground-level SO2 concentrations at 1 km resolution over South Korea through the synergistic use of satellite data and numerical models. A stacking ensemble approach, fusing multiple machine learning algorithms at two levels (i.e., base and meta), was adopted for ground-level SO2 estimation using data from January 2015 to April 2019. Random forest and extreme gradient boosting were used as base models, and multiple linear regression was adopted as the meta-model. The cross-validation results showed that the meta-model improved performance by 25% compared to the base models, yielding a correlation coefficient of 0.48 and a root-mean-square error of 0.0032 ppm. In addition, the temporal transferability of the approach was evaluated on one year of data that were not used in model development. The spatial distribution of ground-level SO2 concentrations based on the proposed model agreed with the general seasonality of SO2 and the temporal patterns of emission sources.

Prediction of Uniaxial Compressive Strength of Rock using Shield TBM Machine Data and Machine Learning Technique (쉴드 TBM 기계 데이터 및 머신러닝 기법을 이용한 암석의 일축압축강도 예측)

  • Kim, Tae-Hwan;Ko, Tae Young;Park, Yang Soo;Kim, Taek Kon;Lee, Dae Hyuk
    • Tunnel and Underground Space
    • /
    • v.30 no.3
    • /
    • pp.214-225
    • /
    • 2020
  • The uniaxial compressive strength (UCS) of rock is one of the important factors determining the advance speed during shield TBM tunnel excavation. UCS can be obtained from the Geotechnical Data Report (GDR), but it is difficult to measure UCS along the entire tunnel alignment. Therefore, the purpose of this study is to predict UCS using TBM machine driving data and machine learning techniques. Several machine learning techniques were compared for predicting UCS, and the stacking model was confirmed to have the best prediction performance. The TBM machine data and UCS used in the analysis were obtained from the excavation of rock strata with slurry shield TBMs. The data were divided 8:2 into training and test sets and pre-processed, including feature selection, scaling, and outlier removal. After hyper-parameter tuning, the stacking model was evaluated with the root-mean-square error (RMSE) and the coefficient of determination (R2), which were 5.556 and 0.943, respectively. Based on these results, stacking models are considered useful for predicting rock strength from TBM excavation data.
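For readers unfamiliar with the two evaluation metrics named in this abstract, a minimal sketch of an 8:2 split, RMSE, and R2 is given below. The helper names and the sample values are illustrative assumptions, not data from the paper, and the split shown is a plain ordered cut (the abstract does not say whether the rows were shuffled).

```python
import math

def split_80_20(rows):
    """Ordered 8:2 split; shuffle the rows first if their order is not random."""
    cut = len(rows) * 8 // 10
    return rows[:cut], rows[cut:]

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up observed and predicted values, for illustration only.
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 71.0, 79.0]

train_rows, test_rows = split_80_20(list(range(10)))
error = rmse(observed, predicted)
fit_quality = r2_score(observed, predicted)
```

RMSE is expressed in the units of the target, while R2 is unitless, which is why the two are usually reported together.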

Design of An Axial Flow Fan with Shape Optimization (형상 최적화를 통한 축류송풍기의 설계)

  • Seo Seoung-Jin;Choi Seung-Man;Kim Kwang-Yong
    • Transactions of the Korean Society of Mechanical Engineers B
    • /
    • v.30 no.7 s.250
    • /
    • pp.603-611
    • /
    • 2006
  • This paper presents a response surface optimization method using three-dimensional Navier-Stokes analysis to optimize the blade shape of an axial flow fan. The Reynolds-averaged Navier-Stokes equations with the $k-{\epsilon}$ turbulence model are discretized with finite volume approximations on an unstructured grid. Regression analysis is used to generate the response surface, which is validated by ANOVA and t-statistics. Four geometric variables, i.e., the sweep and lean angles at the mean and tip, respectively, were employed to improve the efficiency. The computational results are compared with experimental data, and the comparisons show generally good agreement. As the main result of the optimization, the total efficiency was successfully improved. The detailed effects of sweep and lean on the axial flow fan are also discussed.
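The response surface step in this abstract can be illustrated with a minimal sketch. It assumes a single hypothetical design variable and made-up response values (the paper uses four geometric variables with responses from Navier-Stokes analysis), fits a second-order polynomial by least squares, and reads off its stationary point. The helper names `solve_linear`, `fit_quadratic`, and `stationary_point` are illustrative.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_quadratic(xs, ys):
    """Least-squares coefficients (c0, c1, c2) of y ~ c0 + c1*x + c2*x**2."""
    powers = [sum(x ** p for x in xs) for p in range(5)]
    normal = [[powers[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    return solve_linear(normal, rhs)

def stationary_point(coeffs):
    """Location where the fitted quadratic has zero slope (a maximum when c2 < 0)."""
    _, c1, c2 = coeffs
    return -c1 / (2.0 * c2)

# Made-up design-variable samples and responses, for illustration only.
design = [0.0, 1.0, 2.0, 3.0, 4.0]
response = [1.0, 4.0, 5.0, 4.0, 1.0]

coeffs = fit_quadratic(design, response)
best = stationary_point(coeffs)
```

With several design variables the same idea applies, but the polynomial gains cross terms and the stationary point comes from solving a linear system in the gradient.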

Aerodynamic Design Optimization of A Transonic Axial Compressor Rotor with Readjustment of A Design Point (설계유량을 고려한 천음속 축류압축기 동익의 삼차원 형상최적설계)

  • Ko, Woo-Sik;Kim, Kwang-Yong;Ko, Sung-Ho
    • 유체기계공업학회:학술대회논문집 (Korean Fluid Machinery Association: Conference Proceedings)
    • /
    • 2003.12a
    • /
    • pp.639-645
    • /
    • 2003
  • Design optimization of a transonic compressor rotor (NASA rotor 37) using the response surface method and three-dimensional Navier-Stokes analysis has been carried out in this work. The Baldwin-Lomax turbulence model was used in the flow analysis. Two design variables were selected to optimize the stacking line of the blade, and mass flow was also used as a design variable to obtain a new design point at peak efficiency. Data points for response evaluations were selected by D-optimal design, and a linear programming method was used for the optimization on the response surface. As the main result of the optimization, adiabatic efficiency was successfully improved, and a new design mass flow appropriate to the improved blade was obtained. It was also found that the design process provides a reliable design of a turbomachinery blade within reasonable computing time.


Diabetes prediction mechanism using machine learning model based on patient IQR outlier and correlation coefficient (환자 IQR 이상치와 상관계수 기반의 머신러닝 모델을 이용한 당뇨병 예측 메커니즘)

  • Jung, Juho;Lee, Naeun;Kim, Sumin;Seo, Gaeun;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.10
    • /
    • pp.1296-1301
    • /
    • 2021
  • With the recent increase in diabetes incidence worldwide, research has been conducted on predicting diabetes with various machine learning and deep learning technologies. In this work, we present a model for predicting diabetes using machine learning techniques with German Frankfurt Hospital data. We apply outlier handling based on the interquartile range (IQR) together with Pearson correlation, and compare the diabetes prediction performance of Decision Tree, Random Forest, KNN (k-nearest neighbor), SVM (support vector machine), Bayesian Network, and the ensemble techniques XGBoost, Voting, and Stacking. In the study, XGBoost showed the best performance across the various scenarios, with 97% accuracy. This study is therefore meaningful in that the model can be used to accurately predict and help prevent diabetes, which is prevalent in modern society.
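The two preprocessing steps named in this abstract, IQR-based outlier handling and Pearson correlation, can be sketched as below. This is an illustration under assumptions: the abstract does not say whether outliers were dropped or capped, so the sketch drops them, and the 1.5 multiplier is the common convention rather than a value taken from the paper. The sample values are made up.

```python
import math
from statistics import quantiles

def iqr_bounds(values, k=1.5):
    """Lower and upper fences at Q1 - k*IQR and Q3 + k*IQR (k=1.5 is an assumed convention)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up measurements with one obvious outlier, for illustration only.
measurements = [85, 90, 95, 100, 105, 110, 115, 400]
cleaned = drop_outliers(measurements)

# Correlation between a feature and a target, e.g. to rank features before modeling.
strength = pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

Filtering outliers before computing correlations matters because a single extreme value can dominate the Pearson coefficient.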

Dementia Prediction Model based on Gradient Boosting (이기종 머신러닝 모델 기반 치매예측 모델)

  • Lee, Taein;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.12
    • /
    • pp.1729-1738
    • /
    • 2021
  • Machine learning is closely related to cognitive psychology and brain science, and the fields are developing together. This paper analyzes the OASIS-3 dataset using machine learning techniques and proposes a model for predicting dementia. Dimensionality reduction through PCA (Principal Component Analysis) is performed on the OASIS-3 data that quantify the volume of each brain area, only the important features are extracted, and then various machine learning models, including gradient boosting and stacking, are applied and their performance compared. Unlike previous studies, the proposed technique is distinctive in that it uses not only brain biometric data but also basic information, such as the participant's gender, and the participant's medical information. In addition, various performance evaluations showed that the proposed technique yields a model that can better predict dementia by finding the features most related to dementia among various numerical data.

Machine learning-based corporate default risk prediction model verification and policy recommendation: Focusing on improvement through stacking ensemble model (머신러닝 기반 기업부도위험 예측모델 검증 및 정책적 제언: 스태킹 앙상블 모델을 통한 개선을 중심으로)

  • Eom, Haneul;Kim, Jaeseong;Choi, Sangok
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.105-129
    • /
    • 2020
  • This study uses corporate data from 2012 to 2018, when K-IFRS was applied in earnest, to predict default risks. The data used in the analysis totaled 10,545 rows, consisting of 160 columns including 38 in the statement of financial position, 26 in the statement of comprehensive income, 11 in the statement of cash flows, and 76 in the index of financial ratios. Unlike most prior studies, which used default events as the basis for learning about default risk, this study calculated default risk using the market capitalization and stock price volatility of each company based on the Merton model. This made it possible to address the data imbalance caused by the scarcity of default events, which had been pointed out as a limitation of the existing methodology, and to reflect the differences in default risk that exist among ordinary companies. Because the models were trained using only corporate information that is also available for unlisted companies, the default risks of unlisted companies without stock price information can be appropriately derived. The approach can therefore provide stable default risk assessment services to companies, such as small and medium-sized companies and startups, whose default risk is difficult to determine with traditional credit rating models. Although predicting corporate default risk with machine learning has been studied actively in recent years, model bias issues exist because most studies make predictions based on a single model. A stable and reliable valuation methodology is required for calculating default risk, given that an entity's default risk information is very widely used in the market and sensitivity to differences in default risk is high. Strict standards are also required for the calculation methods.
The credit rating method stipulated by the Financial Services Commission in the Financial Investment Regulations calls for the preparation of evaluation methods, including verification of their adequacy, in consideration of past statistical data and experience with credit ratings and of changes in future market conditions. This study reduced the bias of individual models by using stacking ensemble techniques that combine various machine learning models. This makes it possible to capture complex nonlinear relationships between default risk and various corporate information while retaining the short computation time that is an advantage of machine learning-based default risk prediction models. To produce the sub-model forecasts used as input data for the Stacking Ensemble model, the training data were divided into seven pieces, and the sub-models were trained on the divided sets to produce forecasts. To assess the predictive power of the Stacking Ensemble model, Random Forest, MLP, and CNN models were trained on the full training data, and the predictive power of each model was then verified on the test set. The analysis showed that the Stacking Ensemble model exceeded the predictive power of the Random Forest model, the best-performing single model. Next, to check for statistically significant differences between the forecasts of the Stacking Ensemble model and those of each individual model, pairs were constructed between the Stacking Ensemble model and each individual model. Because the Shapiro-Wilk normality test showed that none of the pairs followed a normal distribution, the nonparametric Wilcoxon rank sum test was used to check whether the two model forecasts in each pair differed significantly. The analysis showed that the forecasts of the Stacking Ensemble model differed significantly from those of the MLP model and the CNN model.
In addition, this study offers a methodology that allows existing credit rating agencies to adopt machine learning-based bankruptcy risk prediction, given that traditional credit rating models can also be included as sub-models when calculating the final default probability. The Stacking Ensemble techniques proposed in this study can also help in designing models that meet the requirements of the Financial Investment Business Regulations through the combination of various sub-models. We hope that this research will serve as a resource for increasing practical use by overcoming the limitations of existing machine learning-based models.

Mapping Precise Two-dimensional Surface Deformation on Kilauea Volcano, Hawaii using ALOS2 PALSAR2 Spotlight SAR Interferometry (ALOS-2 PALSAR-2 Spotlight 영상의 위성레이더 간섭기법을 활용한 킬라우에아 화산의 정밀 2차원 지표변위 매핑)

  • Hong, Seong-Jae;Baek, Won-Kyung;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing
    • /
    • v.35 no.6_3
    • /
    • pp.1235-1249
    • /
    • 2019
  • Kilauea Volcano is one of the most active volcanoes in the world. In this study, we used ALOS-2 PALSAR-2 satellite imagery to measure the surface deformation that occurred near the summit of Kilauea Volcano from 2015 to 2017. To measure two-dimensional surface deformation, the interferometric synthetic aperture radar (InSAR) and multiple aperture SAR interferometry (MAI) methods were applied to two interferometric pairs. To improve the precision of the 2D measurement, we compared the root-mean-squared deviation (RMSD) of the difference between measurements while varying the effective antenna length and the normalized squint value, which are factors that can affect the measurement performance of the MAI method. Through this comparison, the factor values that measure deformation most precisely were selected. After selecting the optimal values of the factors, the RMSD of the difference between the MAI measurements decreased from 4.07 cm to 2.05 cm. In the two interferograms, the maximum deformation in the line-of-sight direction is -28.6 cm and -27.3 cm, respectively; the maximum deformation in the along-track direction is 20.2 cm and 20.8 cm, and in the opposite direction -24.9 cm and -24.3 cm, respectively. After stacking the two interferograms, two-dimensional surface deformation mapping was performed, and a maximum surface deformation of approximately 30.4 cm was measured in the northwest direction. In addition, large deformations of more than 20 cm were measured in all directions. The measurement results show that the risk of eruptive activity at Kilauea Volcano is increasing. The measurements of the surface deformation of Kilauea Volcano from 2015 to 2017 are expected to be helpful for future studies of its eruptive activity.