• Title/Summary/Keyword: Ensemble model

A Study on the Application of Modeling to predict the Distribution of Legally Protected Species Under Climate Change - A Case Study of Rodgersia podophylla - (기후변화에 따른 법정보호종 분포 예측을 위한 종분포모델 적용 방법 검토 - Rodgersia podophylla를 중심으로 -)

  • Yoo, Youngjae;Hwang, Jinhoo;Jeon, Seong-woo
    • Journal of the Korean Society of Environmental Restoration Technology / v.27 no.3 / pp.29-43 / 2024
  • Legally protected species are a crucial consideration in the natural ecology component of environmental impact assessments (EIAs). The occurrence of legally protected species, especially 'Endangered Wildlife' designated by the Ministry of Environment, significantly influences the progression of projects subject to EIA, so their habitats must be clearly investigated and presented. From a statistical perspective, a minimum of 30 occurrence coordinates is required for population prediction, but most endangered wildlife species have too few coordinates, which poses challenges for distribution prediction through modeling. Consequently, this study aims to propose modeling methodologies applicable when coordinate data are limited, focusing on Rodgersia podophylla, which represents the characteristics of both endangered wildlife and northern plant species. For this methodology, 30 randomly sampled coordinates were used as input data, simulating a case with little survey data, and modeling was performed using the individual models included in BIOMOD2. The modeling results were then evaluated in terms of discrimination capacity and the ability to reflect reality. An optimal modeling technique was proposed by ensembling the remaining models after excluding the MaxEnt model, which was found to be less reliable in the modeling results. Alongside a discussion of the discrimination capacity metrics (e.g., TSS and AUC) presented in modeling results, this study provides insights and suggestions for improvement; however, because it did not cover a variety of species, its findings are difficult to apply universally. By supporting survey site selection in EIA processes, this research is expected to help minimize situations where protected species are overlooked in survey results.
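
The two discrimination metrics named in this abstract, TSS and AUC, can be illustrated with a minimal sketch. It assumes binary presence/absence observations and model suitability scores in [0, 1]; the function names and the 0.5 threshold are hypothetical choices for illustration, not taken from the paper or from BIOMOD2.

```python
def true_skill_statistic(observed, scores, threshold=0.5):
    """TSS = sensitivity + specificity - 1, for 0/1 observations.

    Assumes at least one presence and one absence in `observed`.
    """
    tp = fp = tn = fn = 0
    for obs, score in zip(observed, scores):
        predicted = 1 if score >= threshold else 0
        if obs == 1 and predicted == 1:
            tp += 1
        elif obs == 1 and predicted == 0:
            fn += 1
        elif obs == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def auc(observed, scores):
    """AUC as the probability that a presence outscores an absence.

    Ties between a presence score and an absence score count as half.
    """
    presences = [s for o, s in zip(observed, scores) if o == 1]
    absences = [s for o, s in zip(observed, scores) if o == 0]
    wins = 0.0
    for p in presences:
        for a in absences:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presences) * len(absences))


# Illustrative data only: three presences followed by three absences.
observed = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
print(true_skill_statistic(observed, scores), auc(observed, scores))
```

TSS depends on the chosen threshold while AUC does not, which is one reason the two metrics can rank models differently.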

Model Interpretation through LIME and SHAP Model Sharing (LIME과 SHAP 모델 공유에 의한 모델 해석)

  • Yong-Gil Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.2 / pp.177-184 / 2024
  • As data volumes grow rapidly, we use all kinds of complex ensemble and deep learning algorithms to achieve the highest accuracy. It is sometimes unclear how these models predict, classify, recognize, and track unknown data. Explaining this behavior has been, and will remain, a goal of intensive research and development in the data science community. A variety of factors, such as lack of data, imbalanced data, and biased data, can affect the decisions rendered by learning models. Many methods are gaining traction for such interpretation. LIME and SHAP, two state-of-the-art open-source explainability techniques, are now commonly used. However, their outputs can differ. In this context, this study introduces a technique that couples LIME and SHAP and demonstrates how it can be used to analyze the decisions made by LightGBM and Keras models when classifying transactions as fraudulent on the IEEE CIS dataset.

Real-time Fall Accident Prediction using Random Forest in IoT Environment (사물인터넷 환경에서 랜덤포레스트를 이용한 실시간 낙상 사고 예측)

  • Chan-Woo Bang;Bong-Hyun Kim
    • Journal of Internet of Things and Convergence / v.10 no.4 / pp.27-33 / 2024
  • As of 2023, the number of accident victims in the domestic construction industry is 26,829, ranking second only to other businesses (service industries). The accident types of casualties in all industries were falls (29,229 people), followed by falls (14,357 people). Based on the above data, this study attaches sensors to hard hats and insoles to predict fall accidents that frequently occur at construction sites, and proposes smart safety equipment that applies a random forest algorithm based on the data collected through this. The random forest model can determine fall accidents in real time with high accuracy by generating multiple decision trees and combining the predictions of each tree. This model classifies whether a worker has had a fall accident and the type of behavior through data collected from the MPU-6050 sensor attached to the hard hat. Fall accidents that are primarily determined from hard hats are secondarily predicted through sensors attached to the insole, thereby increasing prediction accuracy. It is expected that this will enable rapid response in the event of an accident, thereby reducing worker deaths and accidents.
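
The combining step described here, many trees trained on resampled data with their votes merged, can be sketched as follows. This is a deliberately simplified illustration: it bags one-feature decision stumps rather than full trees, omits the random feature selection of a true random forest, and uses made-up acceleration magnitudes. None of the names or values come from the paper.

```python
import random


def train_stump(xs, ys):
    """Pick the threshold t that best separates labels with the rule x > t."""
    values = sorted(set(xs))
    midpoints = [(a + b) / 2 for a, b in zip(values, values[1:])]
    candidates = [float("-inf")] + midpoints + [float("inf")]

    def errors(t):
        return sum((x > t) != bool(y) for x, y in zip(xs, ys))

    return min(candidates, key=errors)


def train_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(train_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_fall(forest, x):
    """Majority vote over all stumps: 1 = fall, 0 = no fall."""
    votes = sum(1 for t in forest if x > t)
    return 1 if votes * 2 > len(forest) else 0


# Hypothetical acceleration magnitudes (in g): normal motion vs. fall events.
magnitudes = [1.0, 1.1, 0.9, 1.2, 2.5, 3.0, 2.8, 3.4]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = train_forest(magnitudes, labels)
print(predict_fall(forest, 3.1), predict_fall(forest, 1.05))
```

Because each stump sees a different resample, individual thresholds vary, and the vote smooths out that variation.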

Measurement of Turbulence Properties at the Time of Flow Reversal Under High Wave Conditions in Hujeong Beach (후정해변 고파랑 조건하에서 파랑유속 방향전환점에서 발생하는 난류성분의 측정)

  • Chang, Yeon S.;Do, Jong Dae;Kim, Sun-Sin;Ahn, Kyungmo;Jin, Jae-Youll
    • Journal of Korean Society of Coastal and Ocean Engineers / v.29 no.4 / pp.206-216 / 2017
  • The temporal distributions of the turbulence kinetic energy (TKE) and the vertical component of the Reynolds stresses ($-\overline{u^{\prime}w^{\prime}}$) were measured over one wave period under high wave energy conditions. The wave data were obtained at Hujeong Beach on the east coast of Korea on January 14-18, 2017, when an extratropical cyclone developed in the East Sea. Among the thousands of waves measured during the period, hundreds of regular waves with similar patterns were selected for analysis to obtain three representative mean wave patterns using the ensemble average technique. The turbulence properties were then estimated based on the selected wave data. Interestingly, $-\overline{u^{\prime}w^{\prime}}$ has one clear peak near the time of flow reversal, while TKE has two peaks at the times of maximum cross-shore velocity magnitude. The distinct pattern of the Reynolds stress indicates that vertical fluxes of properties such as suspended sediments may be enhanced when the horizontal flow direction reverses and disturbs the flow, supporting the turbulence convection process proposed by Nielsen (1992). The characteristic patterns of the turbulence properties are examined using the CADMAS-SURF Reynolds-Averaged Navier-Stokes (RANS) model. Although the model can reasonably simulate the TKE distribution, it fails to produce the $-\overline{u^{\prime}w^{\prime}}$ peak at the time of flow reversal, which indicates that the RANS model is of limited use for predicting some turbulence properties such as Reynolds stresses.
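
A minimal sketch of the ensemble average technique mentioned in this abstract: each selected wave is treated as one realization sampled at the same phases, the ensemble mean is removed at each phase, and TKE and $-\overline{u^{\prime}w^{\prime}}$ are computed from the remaining fluctuations. The three-component TKE definition and all names are assumptions for illustration; the paper's exact processing is not given in the abstract.

```python
def ensemble_mean(realizations):
    """Mean over realizations at each phase (realizations[i][k])."""
    n = len(realizations)
    return [sum(column) / n for column in zip(*realizations)]


def fluctuations(realizations):
    """Subtract the phase-wise ensemble mean from every realization."""
    mean = ensemble_mean(realizations)
    return [[x - m for x, m in zip(row, mean)] for row in realizations]


def turbulence_stats(u, v, w):
    """Return (TKE, Reynolds stress) lists, one value per phase.

    TKE = 0.5 * (<u'u'> + <v'v'> + <w'w'>), stress = -<u'w'>,
    where <.> is the ensemble average over realizations.
    """
    up, vp, wp = fluctuations(u), fluctuations(v), fluctuations(w)
    n = len(u)
    tke, stress = [], []
    for k in range(len(u[0])):
        uu = sum(up[i][k] ** 2 for i in range(n)) / n
        vv = sum(vp[i][k] ** 2 for i in range(n)) / n
        ww = sum(wp[i][k] ** 2 for i in range(n)) / n
        uw = sum(up[i][k] * wp[i][k] for i in range(n)) / n
        tke.append(0.5 * (uu + vv + ww))
        stress.append(-uw)
    return tke, stress


# Two illustrative realizations sampled at two phases (velocities in m/s).
u = [[1.0, 2.0], [3.0, 2.0]]
v = [[0.0, 0.0], [0.0, 0.0]]
w = [[-1.0, 0.5], [1.0, 0.5]]
tke, stress = turbulence_stats(u, v, w)
print(tke, stress)
```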

Prediction of Potential Species Richness of Plants Adaptable to Climate Change in the Korean Peninsula (한반도 기후변화 적응 대상 식물 종풍부도 변화 예측 연구)

  • Shin, Man-Seok;Seo, Changwan;Lee, Myungwoo;Kim, Jin-Yong;Jeon, Ja-Young;Adhikari, Pradeep;Hong, Seung-Bum
    • Journal of Environmental Impact Assessment / v.27 no.6 / pp.562-581 / 2018
  • This study was designed to predict changes in the species richness of plants under climate change in South Korea. The target species were selected from the Plants Adaptable to Climate Change in the Korean Peninsula. Altogether, 89 species were selected, including 23 native plants, 30 northern plants, and 36 southern plants. We used a species distribution model to predict the potential habitat of each species under climate change, applying ten single-model algorithms and the pre-evaluation weighted ensemble method. Species richness was then derived from the results for individual species. Two representative concentration pathways (RCP 4.5 and RCP 8.5) were used to simulate the species richness of plants in 2050 and 2070. The current species richness was predicted to be high in the national parks located in the Baekdudaegan mountain range in Gangwon Province and on the islands of the South Sea. The future species richness was predicted to be lower in the national parks and the Baekdudaegan mountain range in Gangwon Province and higher in southern coastal regions. The average current species richness was higher in the national park area than in South Korea as a whole. However, the predicted future species richness did not differ between the national park area and the whole area of South Korea. The difference between current and future species richness could be due to the disappearance of a large number of native and northern plants from South Korea. An additional reason could be the expansion of the potential habitat of southern plants under climate change. However, if species fail to disperse to suitable habitat, species richness will decrease drastically. The results differed depending on whether or not species dispersed. This study will be useful for conservation planning, the establishment of protected areas, the restoration of biological species, and strategies for adaptation to climate change.
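
The two steps described here, combining single-model outputs into a weighted ensemble and then summing species into a richness map, can be sketched as below. The sketch assumes that weights are proportional to each model's pre-evaluation score and that a fixed suitability threshold of 0.5 marks presence; both are illustrative assumptions, and all names are hypothetical.

```python
def weighted_ensemble(model_maps, model_scores):
    """Score-weighted mean suitability per grid cell.

    model_maps[m][c] is model m's suitability for cell c;
    model_scores[m] is that model's evaluation score (assumed positive).
    """
    total = sum(model_scores)
    n_cells = len(model_maps[0])
    return [
        sum(score * cells[c] for cells, score in zip(model_maps, model_scores)) / total
        for c in range(n_cells)
    ]


def species_richness(ensemble_by_species, threshold=0.5):
    """Count, per cell, the species whose ensemble suitability meets the threshold."""
    n_cells = len(next(iter(ensemble_by_species.values())))
    richness = [0] * n_cells
    for suitability in ensemble_by_species.values():
        for c, value in enumerate(suitability):
            if value >= threshold:
                richness[c] += 1
    return richness


# Illustrative inputs: two species, two models each, three grid cells.
ensembles = {
    "species_a": weighted_ensemble([[0.8, 0.2, 0.7], [0.6, 0.4, 0.5]], [0.75, 0.25]),
    "species_b": weighted_ensemble([[0.9, 0.7, 0.1], [0.7, 0.9, 0.3]], [0.5, 0.5]),
}
print(species_richness(ensembles))
```

Running the same richness calculation on current and future ensemble maps, then differencing them cell by cell, would give the kind of change map the abstract describes.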

Evaluation of Hydrogeological Characteristic of Natural Barrier in Korea for Establishing Safety Guidelines of Deep Geological High-Level Radioactive Waste Disposal Site (고준위방사성폐기물 심층처분 부지 수리 지질 안전 규제를 위한 국내 지질환경 수리 특성 평가)

  • Suwan So;Jiho Jeong;Jaesung Park;Hyeongmok Lee;Subi Lee;Sujin Kim;Sinda Mbarki;Jina Jeong
    • Economic and Environmental Geology / v.57 no.4 / pp.397-416 / 2024
  • This study assessed the hydrogeological properties of the deep geological environment to develop safety criteria for the natural barriers used in the deep geological disposal of high-level radioactive waste in Korea. The assessment focused on the distribution and trends of hydraulic conductivity and permeability appropriate for the domestic geological environment, using various in-situ hydraulic test data collected for groundwater development and management. To develop a depth-hydrogeological property relationship model suitable for domestic conditions, the study reviewed various international research examples and applied a representative model that explains the trends of hydraulic conductivity and permeability with depth. Developing the model for Korea involved applying ensemble regression analysis to account for the uncertainty of various factors in the collected data. The results confirmed that existing international depth-hydrogeological property relationship models adequately describe the characteristics of the domestic geological environment. Considering the preferred hydrogeological criteria suggested by countries such as Sweden, Germany, and Canada, there is a high likelihood that a suitable geological environment exists in Korea. Additionally, applying hydrogeological criteria indicative of low-permeability environments showed that suitable conditions for disposal facility construction increase at depths greater than 300 m, and that the influence of fractures on groundwater flow might be minimal at depths exceeding 500 m. This research can serve as foundational information for establishing hydrogeological safety standards for natural barriers in Korea in line with international regulatory guidelines.
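
One common way to realize "ensemble regression" for a depth trend is a bootstrap ensemble of simple fits, sketched below. The linear form log10(K) = a + b * depth, the bootstrap resampling, and all names are assumptions made for illustration; the abstract does not state which depth model or ensemble scheme the authors used.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns None when all x values are identical (slope undefined).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_ensemble(depths, log_k, n_members=200, seed=0):
    """Fit one line per bootstrap resample of the (depth, log10 K) pairs."""
    rng = random.Random(seed)
    n = len(depths)
    fits = []
    while len(fits) < n_members:
        idx = [rng.randrange(n) for _ in range(n)]
        fit = fit_line([depths[i] for i in idx], [log_k[i] for i in idx])
        if fit is not None:  # skip degenerate resamples
            fits.append(fit)
    return fits


def predict_log_k(fits, depth):
    """Ensemble mean plus the min/max spread of member predictions."""
    preds = [slope * depth + intercept for slope, intercept in fits]
    return sum(preds) / len(preds), min(preds), max(preds)


# Synthetic data only: depth in metres, log10 of hydraulic conductivity (m/s).
depths = [100, 200, 300, 400, 500, 600]
log_k = [-6.1, -6.5, -6.6, -6.9, -7.0, -7.3]
fits = bootstrap_ensemble(depths, log_k)
print(predict_log_k(fits, 500))
```

The spread between ensemble members gives a rough picture of how uncertain the depth trend is under the available data.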

A Correction of East Asian Summer Precipitation Simulated by PNU/CME CGCM Using Multiple Linear Regression (다중 선형 회귀를 이용한 PNU/CME CGCM의 동아시아 여름철 강수예측 보정 연구)

  • Hwang, Yoon-Jeong;Ahn, Joong-Bae
    • Journal of the Korean earth science society / v.28 no.2 / pp.214-226 / 2007
  • Because precipitation is influenced by various atmospheric variables, it is highly nonlinear. Although precipitation predicted by a dynamic model can be corrected using a nonlinear artificial neural network, this approach has limitations such as the choice of initial weights, local minima, and the number of neurons. In the present paper, we correct simulated precipitation using multiple linear regression (MLR), which is simple and widely used. First, an ensemble hindcast is conducted with the PNU/CME Coupled General Circulation Model (CGCM) (Park and Ahn, 2004) for the period from April to August in 1979-2005. MLR is applied to precipitation simulated by the PNU/CME CGCM for the months of June (lead 2), July (lead 3), and August (lead 4) and for the seasonal mean JJA (June to August) over the Northeast Asian region including the Korean Peninsula $(110^{\circ}-145^{\circ}E,\;25-55^{\circ}N)$. We build the MLR model using a linear relationship between observed precipitation and the hindcast results from the PNU/CME CGCM. The predictor variables selected from the CGCM are precipitation, 500 hPa vertical velocity, 200 hPa divergence, surface air temperature, and others. After leave-one-out cross-validation, the results are compared with those of the PNU/CME CGCM. The results, including Heidke skill scores, demonstrate that the MLR-corrected results provide better rainfall forecasts than the direct CGCM output.
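
The Heidke skill score used for verification here can be sketched for categorical forecasts as HSS = (PC - E) / (1 - E), where PC is the proportion of correct forecasts and E is the proportion expected correct by chance from the marginal frequencies. The category labels and sample data are illustrative assumptions, not values from the paper.

```python
from collections import Counter


def heidke_skill_score(observed, forecast):
    """Multi-category Heidke skill score.

    1 means a perfect forecast, 0 means no better than chance.
    Undefined (division by zero) if chance agreement is already 1,
    e.g. when every observation and forecast is the same category.
    """
    n = len(observed)
    proportion_correct = sum(1 for o, f in zip(observed, forecast) if o == f) / n
    obs_freq = Counter(observed)
    fc_freq = Counter(forecast)
    expected = sum(obs_freq[c] * fc_freq[c] for c in obs_freq) / (n * n)
    return (proportion_correct - expected) / (1.0 - expected)


# Illustrative seasonal rainfall categories for four years.
observed = ["below", "normal", "above", "above"]
raw_model = ["above", "normal", "below", "above"]
corrected = ["below", "normal", "above", "above"]
print(heidke_skill_score(observed, raw_model), heidke_skill_score(observed, corrected))
```

In a leave-one-out setting, each year's corrected forecast would come from a regression fitted without that year, and the score would be computed over the held-out forecasts.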

Bayesian networks-based probabilistic forecasting of hydrological drought considering drought propagation (가뭄의 전이 현상을 고려한 수문학적 가뭄에 대한 베이지안 네트워크 기반 확률 예측)

  • Shin, Ji Yae;Kwon, Hyun-Han;Lee, Joo-Heon;Kim, Tae-Woong
    • Journal of Korea Water Resources Association / v.50 no.11 / pp.769-779 / 2017
  • As droughts have recently become more frequent, reliable drought forecasting is required for drought mitigation and proactive management of water resources. This study developed a probabilistic hydrological drought forecasting method, named the Propagated Bayesian Networks Drought Forecasting (PBNDF) model, which uses Bayesian networks and the drought propagation relationship to estimate future drought together with the forecast uncertainty. The proposed PBNDF model is composed of four nodes: past, current, and multi-model ensemble (MME) forecast information, and the drought propagation relationship. Using the Palmer Hydrological Drought Index (PHDI), the PBNDF model was applied to forecast hydrological drought conditions at 10 gauging stations in the Nakdong River basin. Receiver operating characteristic (ROC) curve analysis was applied to measure the forecast skill of the forecast mean values. The root mean squared error (RMSE) and skill score (SS) were employed to compare the forecast performance with previously developed forecast models (persistence forecast, Bayesian network drought forecast). The PBNDF model performed better, with a low RMSE and a high SS of 0.1 to 0.15. Overall, the results indicate that the PBNDF model has good potential for probabilistic drought forecasting.
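
The RMSE and skill score comparison against a reference forecast can be sketched as follows. The abstract does not define SS, so the RMSE-based form SS = 1 - RMSE_forecast / RMSE_reference is an assumption here, as are the function names and the sample values.

```python
import math


def rmse(forecast, observed):
    """Root mean squared error between two equal-length series."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)


def skill_score(forecast, reference, observed):
    """Relative improvement over a reference forecast (assumed definition).

    Positive values mean the forecast beats the reference; 1 is perfect.
    """
    return 1.0 - rmse(forecast, observed) / rmse(reference, observed)


# Illustrative drought-index values; the reference is a persistence
# forecast, i.e. each month is predicted to equal the previous month.
observed = [-1.2, -1.6, -2.0, -1.4]
persistence = [-0.8, -1.2, -1.6, -2.0]
model = [-1.0, -1.5, -1.8, -1.7]
print(rmse(model, observed), skill_score(model, persistence, observed))
```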

Meteorological drought outlook with satellite precipitation data using Bayesian networks and decision-making model (베이지안 네트워크 및 의사결정 모형을 이용한 위성 강수자료 기반 기상학적 가뭄 전망)

  • Shin, Ji Yae;Kim, Ji-Eun;Lee, Joo-Heon;Kim, Tae-Woong
    • Journal of Korea Water Resources Association / v.52 no.4 / pp.279-289 / 2019
  • Unlike other natural disasters, drought is a recurring, region-wide phenomenon triggered by a prolonged precipitation deficiency. Considering that remote sensing products provide consistent temporal and spatial measurements of precipitation, this study developed a remote sensing data-based drought outlook model. Meteorological drought was defined by the Standardized Precipitation Index (SPI) derived from PERSIANN_CDR, TRMM 3B42, and GPM IMERG images. Bayesian networks were employed to combine historical drought information and dynamical prediction products before producing the drought outlook. The drought outlook was determined through a decision-making model that considers the current drought condition and the condition forecast by the Bayesian networks. The drought outlook condition was classified into four states: no drought, drought occurrence, drought persistence, and drought removal. Receiver operating characteristic (ROC) curve analysis was employed to measure the outlook performance relative to the dynamical prediction product, the Multi-Model Ensemble (MME). The ROC analysis indicated that the proposed outlook model performed better than the MME, especially for drought occurrence and persistence in the 2- and 3-month outlooks.
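
The four outlook states follow from comparing the current and forecast drought conditions, which can be sketched as a small decision rule. The SPI threshold of -1.0 for "in drought" and the function name are illustrative assumptions; the paper's decision-making model works with Bayesian network output and may use different criteria.

```python
def drought_outlook(current_spi, forecast_spi, threshold=-1.0):
    """Map current and forecast SPI values to one of four outlook states.

    A location is treated as in drought when SPI <= threshold
    (the -1.0 default is an assumption, not taken from the paper).
    """
    in_drought_now = current_spi <= threshold
    in_drought_later = forecast_spi <= threshold
    if in_drought_now and in_drought_later:
        return "drought persistence"
    if in_drought_now:
        return "drought removal"
    if in_drought_later:
        return "drought occurrence"
    return "no drought"


# Illustrative (current SPI, forecast SPI) pairs.
for current, forecast in [(0.3, 0.1), (0.2, -1.4), (-1.5, -1.2), (-1.3, -0.2)]:
    print(current, forecast, drought_outlook(current, forecast))
```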

A Recommending System for Care Plan(Res-CP) in Long-Term Care Insurance System (데이터마이닝 기법을 활용한 노인장기요양급여 권고모형 개발)

  • Han, Eun-Jeong;Lee, Jung-Suk;Kim, Dong-Geon;Ka, Im-Ok
    • The Korean Journal of Applied Statistics / v.22 no.6 / pp.1229-1237 / 2009
  • In the long-term care insurance (LTCI) system, the question of how to provide the most appropriate care has become a major issue for the elderly, their families, and policy makers. To help beneficiaries use LTC services in line with their care needs, the National Health Insurance Corporation (NHIC) provides them with an individualized care plan, named the Long-term Care User Guide. It includes recommendations for the most appropriate type of care for each beneficiary. The purpose of this study is to develop a recommending system for care plans (Res-CP) in the LTCI system. We used the data set for the Long-term Care User Guide from the 3rd long-term care insurance pilot program. To develop the model, we tested four models: a decision-tree model from data mining, a logistic regression model, and bagging and boosting techniques as ensemble models. The decision-tree model was selected to describe the Res-CP because its algorithm is easy to explain to the working groups. Res-CP might be useful for evidence-based care planning in the LTCI system and may help support the efficient use of LTC services.