• Title/Summary/Keyword: Random Forest (RF)


Development of High-frequency Data-based Inflow Water Temperature Prediction Model and Prediction of Changes in Stratification Strength of Daecheong Reservoir Due to Climate Change (고빈도 자료기반 유입 수온 예측모델 개발 및 기후변화에 따른 대청호 성층강도 변화 예측)

  • Han, Jongsu;Kim, Sungjin;Kim, Dongmin;Lee, Sawoo;Hwang, Sangchul;Kim, Jiwon;Chung, Sewoong
    • Journal of Environmental Impact Assessment / v.30 no.5 / pp.271-296 / 2021
  • Thermal stratification in a reservoir inhibits vertical mixing between the upper and lower layers, promotes the formation of a hypoxic layer, and enhances nutrient release from the sediment. Changes in the stratification structure of a reservoir under future climate change are therefore very important for water quality and aquatic ecology management. This study aimed to develop a data-driven inflow water temperature prediction model for Daecheong Reservoir (DR), and to predict future inflow water temperature and the stratification structure of DR under the Representative Concentration Pathways (RCP) climate scenarios. The random forest (RF) regression model (NSE 0.97, RMSE 1.86℃, MAPE 9.45%) developed to predict the inflow temperature of DR adequately reproduced the statistics and variability of the observed water temperature. Future meteorological data for each RCP scenario, predicted by the regional climate model (HadGEM3-RA), were input into the RF model to predict the inflow water temperature, and a three-dimensional hydrodynamic model (AEM3D) was used to predict the change in the future (2018~2037, 2038~2057, 2058~2077, 2078~2097) stratification structure of DR due to climate change. The rates of increase in air temperature and inflow water temperature were 0.14~0.48℃/10 years and 0.21~0.41℃/10 years, respectively. In the seasonal analysis, the increase in inflow water temperature was statistically significant in all scenarios except spring and winter under RCP 2.6, and the rate of increase was higher when the carbon reduction effort was weaker. The surface water temperature of the reservoir increased at 0.04~0.38℃/10 years, and the stratification period gradually lengthened in all scenarios. In particular, under the RCP 8.5 scenario, the number of stratification days is expected to increase by about 24 days. These results are consistent with previous studies reporting that climate change strengthens the stratification intensity of lakes and reservoirs and prolongs the stratification period, and they suggest that prolonged thermal stratification could cause changes in the aquatic ecosystem, such as spatial expansion of the low-oxygen layer, an increase in sediment nutrient release, and changes in the dominant algal species in the water body.
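The abstract above reports three skill scores for the RF model (NSE, RMSE, MAPE). As a minimal sketch of how such scores are commonly computed, assuming the standard textbook definitions (the abstract does not state the exact formulas used) and using made-up sample values rather than data from the paper:

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / (sum of squared deviations of obs from its mean)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def rmse(obs, sim):
    """Root mean square error, in the same unit as the data."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mape(obs, sim):
    """Mean absolute percentage error in percent; undefined if any observation is 0."""
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs)


# Hypothetical water temperatures (degrees C), for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
simulated = [11.0, 12.0, 13.0, 16.0]

print(nse(observed, simulated), rmse(observed, simulated), mape(observed, simulated))
```

An NSE of 1 means a perfect fit, and 0 means the model is no better than predicting the observed mean. MAPE becomes unstable when observed values approach zero, which matters for winter water temperatures.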

Spatial Gap-filling of GK-2A/AMI Hourly AOD Products Using Meteorological Data and Machine Learning (기상모델자료와 기계학습을 이용한 GK-2A/AMI Hourly AOD 산출물의 결측화소 복원)

  • Youn, Youjeong;Kang, Jonggu;Kim, Geunah;Park, Ganghyun;Choi, Soyeon;Lee, Yangwon
    • Korean Journal of Remote Sensing / v.38 no.5_3 / pp.953-966 / 2022
  • Since aerosols adversely affect human health, for example by deteriorating air quality, quantitative observation of the distribution and characteristics of aerosols is essential. Recently, satellite-based Aerosol Optical Depth (AOD) data have been used in various studies as a means of acquiring periodic and quantitative information on the global scale, but AOD images from optical satellite sensors have missing pixels in cloudy areas. In this study, we built a Random Forest-based gap-filling model using gridded meteorological and geographic elements as input variables, and used it to produce gap-free hourly AOD images for the GeoKompsat 2A (GK-2A) Advanced Meteorological Imager (AMI). The model has a Mean Bias Error (MBE) of -0.002 and a Root Mean Square Error (RMSE) of 0.145, which is better than the target accuracy of the original data. Considering that the target is an atmospheric variable, the Correlation Coefficient (CC) of 0.714 indicates that the model has sufficient explanatory power. The high temporal resolution of geostationary satellites is suitable for observing diurnal variation, and the gap-free product can support other research, such as input for atmospheric correction, estimation of ground-level PM, and analysis of small fires or pollutants.

High frequency somatic embryogenesis and plant regeneration of interspecific ginseng hybrid between Panax ginseng and Panax quinquefolius

  • Kim, Jong Youn;Adhikari, Prakash Babu;Ahn, Chang Ho;Kim, Dong Hwi;Kim, Young Chang;Han, Jung Yeon;Kondeti, Subramanyam;Choi, Yong Eui
    • Journal of Ginseng Research / v.43 no.1 / pp.38-48 / 2019
  • Background: The interspecific ginseng hybrid Panax ginseng ${\times}$ Panax quinquefolius (Pgq) grows vigorously and produces larger roots than its parents. However, F1 progenies are completely male sterile. Plant tissue culture technology can circumvent this issue and propagate the hybrid. Methods: Murashige and Skoog (MS) medium with different concentrations (0, 2, 4, and 6 mg/L) of 2,4-dichlorophenoxyacetic acid (2,4-D) was used for callus induction and somatic embryogenesis (SE). The embryos, after culturing on $GA_3$ supplemented medium, were transferred to hormone free 1/2 Schenk and Hildebrandt (SH) medium. The developed taproots with dormant buds were treated with $GA_3$ to break the bud dormancy, and transferred to soil. Hybrid Pgq plants were verified by random amplified polymorphic DNA (RAPD) and inter simple sequence repeat (ISSR) analyses and by LC-IT-TOF-MS. Results: We conducted a comparative study of SE in Pgq and its parents, and attempted to establish the soil transfer of in vitro propagated Pgq tap roots. The Pgq explants showed a higher rate of embryogenesis (~56% at 2 mg/L 2,4-D) as well as a higher number of embryos per explant (~7 at the same 2,4-D concentration) than either of its parents. The germinated embryos, after culturing on $GA_3$ supplemented medium, were transferred to hormone free 1/2 SH medium to support continued growth and kept until nutrient depletion induced senescence (NuDIS) with leaf defoliation occurred (4 months). By that time, thickened tap roots with well-developed lateral roots and dormant buds were obtained. All Pgq tap roots pretreated with 20 mg/L $GA_3$ for at least a week produced new shoots after soil transfer. We selected discriminatory RAPD and ISSR markers to distinguish the interspecific ginseng hybrid from its parents. The $F_1$ hybrid (Pgq) contained two species-specific ginsenosides (ginsenoside Rf in P. ginseng and pseudoginsenoside $F_{11}$ in P. quinquefolius), and higher amounts of other ginsenosides than its parents. Conclusion: Micropropagation of the interspecific hybrid ginseng offers an opportunity for continuous production of plants.

Cloud Removal Using Gaussian Process Regression for Optical Image Reconstruction

  • Park, Soyeon;Park, No-Wook
    • Korean Journal of Remote Sensing / v.38 no.4 / pp.327-341 / 2022
  • Cloud removal is often required to construct time-series sets of optical images for environmental monitoring. In regression-based cloud removal, the selection of an appropriate regression model and the impact analysis of the input images significantly affect the prediction performance. This study evaluates the potential of Gaussian process (GP) regression for cloud removal and also analyzes the effects of cloud-free optical images and spectral bands on prediction performance. Unlike other machine learning-based regression models, GP regression provides uncertainty information and automatically optimizes hyperparameters. An experiment using Sentinel-2 multi-spectral images was conducted for cloud removal in two agricultural regions. The prediction performance of GP regression was compared with that of random forest (RF) regression. Various combinations of input images and multi-spectral bands were considered for quantitative evaluations. The experimental results showed that using multi-temporal images with multi-spectral bands as inputs achieved the best prediction accuracy. Highly correlated adjacent multi-spectral bands and temporally correlated multi-temporal images resulted in an improved prediction accuracy. The prediction performance of GP regression was significantly improved in predicting the near-infrared band compared to that of RF regression. Estimating the distribution function of input data in GP regression could reflect the variations in the considered spectral band with a broader range. In particular, GP regression was superior to RF regression for reproducing structural patterns at both sites in terms of structural similarity. In addition, uncertainty information provided by GP regression showed a reasonable similarity to prediction errors for some sub-areas, indicating that uncertainty estimates may be used to measure the prediction result quality. These findings suggest that GP regression could be beneficial for cloud removal and optical image reconstruction. In addition, the impact analysis results of the input images provide guidelines for selecting optimal images for regression-based cloud removal.

Performance Evaluation of Machine Learning Algorithms for Cloud Removal of Optical Imagery: A Case Study in Cropland (광학 영상의 구름 제거를 위한 기계학습 알고리즘의 예측 성능 평가: 농경지 사례 연구)

  • Soyeon Park;Geun-Ho Kwak;Ho-Yong Ahn;No-Wook Park
    • Korean Journal of Remote Sensing / v.39 no.5_1 / pp.507-519 / 2023
  • Multi-temporal optical images have been utilized for time-series monitoring of croplands. However, the presence of clouds imposes limitations on image availability, often requiring a cloud removal procedure. This study assesses the applicability of various machine learning algorithms for effective cloud removal in optical imagery. We conducted comparative experiments by focusing on two key variables that significantly influence the predictive performance of machine learning algorithms: (1) land-cover types of training data and (2) temporal variability of land-cover types. Three machine learning algorithms, including Gaussian process regression (GPR), support vector machine (SVM), and random forest (RF), were employed for the experiments using simulated cloudy images in paddy fields of Gunsan. GPR and SVM exhibited superior prediction accuracy when the training data had the same land-cover types as the cloud region, and GPR showed the best stability with respect to sampling fluctuations. In addition, RF was the least affected by the land-cover types and temporal variations of training data. These results indicate that GPR is recommended when the land-cover type and spectral characteristics of the training data are the same as those of the cloud region. On the other hand, RF should be applied when it is difficult to obtain training data with the same land-cover types as the cloud region. Therefore, the land-cover types in cloud areas should be taken into account for extracting informative training data along with selecting the optimal machine learning algorithm.

Land Cover Classification of High-Spatial Resolution Imagery using Fixed-Wing UAV (고정익 UAV를 이용한 고해상도 영상의 토지피복분류)

  • Yang, Sung-Ryong;Lee, Hak-Sool
    • Journal of the Society of Disaster Information / v.14 no.4 / pp.501-509 / 2018
  • Purpose: UAV-based photogrammetry is being actively researched in the spatial information field because it is not only cost-effective compared to conventional aerial imaging but also makes it easy to obtain high-resolution data at a desired time and location. In this study, UAV-based high-resolution images were used to perform land cover classification. Method: RGB cameras were used to obtain high-resolution images, and multispectral cameras were additionally used to photograph the same regions in order to accurately classify the feeding areas. Finally, land cover classification was carried out for a total of seven classes using the orthoimages created from the RGB and multispectral cameras, a DSM (Digital Surface Model), NDVI (Normalized Difference Vegetation Index), and GLCM (Gray-Level Co-occurrence Matrix) as inputs to RF (Random Forest), a representative supervised classification method. Results: To assess the classification, an accuracy assessment based on the error matrix was conducted; comparison with the classification results obtained using RGB images only verified that the proposed method could effectively classify the classes in the region. Conclusion: When the orthoimage, multispectral image, NDVI, and GLCM proposed in this study were added, accuracy was higher than with the conventional orthoimage alone. Future research will attempt to improve classification accuracy through the development of additional input data.

Research on the Evaluation and Utilization of Constitutional Diagnosis by Korean Doctors using AI-based Evaluation Tool (인공지능 기반 평가 도구를 이용한 한의사의 체질 진단 평가 및 활용 방안에 대한 연구)

  • Park, Musun;Hwang, Minwoo;Lee, Jeongyun;Kim, Chang-Eop;Kwon, Young-Kyu
    • Journal of Physiology & Pathology in Korean Medicine / v.36 no.2 / pp.73-78 / 2022
  • Since Traditional Korean medicine (TKM) doctors use various knowledge systems during treatment, diagnosis results may differ from one TKM doctor to another. However, it is difficult to explain all the reasons for a diagnosis because TKM doctors use both explicit and implicit knowledge. In this study, an upgraded random forest (RF)-based evaluation tool was proposed to extract the clinical knowledge of TKM doctors. The tool was also used to confirm the extent to which a professor's clinical knowledge was delivered to trainees. The data used to construct the evaluation tool came from 106 people who visited the Sasang Constitutional Department at Kyung Hee University Korean Medicine Hospital at Gangdong. For explicit knowledge extraction, four TKM doctors were asked to express the importance of symptoms as scores. For implicit knowledge extraction, importance scores were obtained from the RF model trained on the patients' symptoms and the TKM doctors' constitutional determination results. To confirm the delivery of clinical knowledge, the similarity between the symptoms that professors and trainees consider important when discriminating constitution was calculated using the Jaccard coefficient. The proposed tool was able to successfully evaluate the clinical knowledge of TKM doctors, and it was confirmed that the professor's clinical knowledge was delivered to the trainee. The tool can be used in various fields, such as providing feedback on treatment, educating trainee TKM doctors, and developing AI in TKM.
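The similarity step in the abstract above can be illustrated with a small sketch. It assumes that each doctor's "important symptoms" are taken as the top-k entries of an importance-score table; the abstract does not say how the sets were formed, so `top_k`, the value of k, and all symptom names and scores below are hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(union)


def top_k(importance, k):
    """Names of the k highest-scoring symptoms in an importance-score dict."""
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    return {name for name, _ in ranked[:k]}


# Hypothetical importance scores, for illustration only.
professor = {"cold_hands": 0.9, "sweating": 0.7, "appetite": 0.4, "insomnia": 0.1}
trainee = {"cold_hands": 0.8, "sweating": 0.2, "appetite": 0.6, "insomnia": 0.5}

similarity = jaccard(top_k(professor, 2), top_k(trainee, 2))
print(similarity)
```

Here the two top-2 sets share one symptom out of three distinct ones, so the coefficient is 1/3. A value of 1 would mean the professor and trainee weight exactly the same symptoms.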

A Ship-Wake Joint Detection Using Sentinel-2 Imagery

  • Woojin, Jeon;Donghyun, Jin;Noh-hun, Seong;Daeseong, Jung;Suyoung, Sim;Jongho, Woo;Yugyeong, Byeon;Nayeon, Kim;Kyung-Soo, Han
    • Korean Journal of Remote Sensing / v.39 no.1 / pp.77-86 / 2023
  • Ship detection is widely used in areas such as maritime security, maritime traffic, fisheries management, illegal fishing, and border control, and it is important for rapid response and damage minimization as ship accident rates rise with the recent increase in international maritime traffic. Currently, according to a number of global and national regulations, ships must be equipped with an automatic identification system (AIS), which periodically provides information such as the location and speed of the ship. However, most small vessels (less than 300 tons) are not obligated to install the transponder, and AIS signals may fail to be transmitted, either intentionally or accidentally. There are even cases of misuse of a ship's location information. Therefore, in this study, ship detection was performed using high-resolution optical satellite images, which can periodically observe a wide area remotely and detect small ships. However, optical images can cause false alarms due to noise on the sea surface, such as waves, or factors with ship-like brightness, such as clouds and wakes, so it is important to remove these factors to improve the accuracy of ship detection. In this study, false alarms were reduced and the accuracy of ship detection was improved by removing wakes. Ship detection was performed using machine learning-based random forest (RF) and convolutional neural network (CNN) techniques that have recently been widely used in object detection, and the detection results of the models were compared and analyzed. In addition, the results of RF and CNN were combined to mitigate the ship disconnection and small detection phenomena. The ship detection results of this study are significant in that they overcame the limitations of each model while maintaining accuracy. In addition, if satellite images with improved spatial resolution are utilized in the future, simultaneous ship and wake detection is expected to achieve higher accuracy.

Characteristics of Distribution of Phytoplankton Communities in Three Estuarial Lakes of the Yeongsan River (영산강 하구역에 위치한 세 호수의 식물플랑크톤 군집 분포 특성)

  • Cho, Hyeon Jin;Na, Jeong Eun;Lee, Gun Ju;Lee, Hak Young
    • Korean Journal of Ecology and Environment / v.54 no.4 / pp.291-302 / 2021
  • The phytoplankton community in an estuarine system is easily affected by changes in physicochemical factors. The present study analyzed phytoplankton community distribution and similarity, and explored the factors influencing variations in phytoplankton community structure, in three lakes located in the Yeongsan River estuary from March 2014 to November 2017. We carried out non-metric multidimensional scaling (NMDS) and random forest (RF) analysis to compare the patterns of phytoplankton distribution and the relationship between phytoplankton distribution and environmental variables. Similarity Percentage (SIMPER) and Analysis of Similarity (ANOSIM) were performed to assess the similarity of the phytoplankton community at each site of the three lakes. In the NMDS results, phytoplankton community distribution differed between Yeongsan and Gumho lakes, and the factors influencing the distribution of phytoplankton communities across the three lakes were water temperature, dissolved oxygen, total nitrogen (T-N), nitrate-N (NO3-N), and conductivity. Based on RF, NO3-N was a key factor influencing phytoplankton community structure in the three lakes. A total of 24 species were identified as indicator species in the three lakes studied, with the highest species numbers observed in Yeongsan Lake (13) and the lowest observed in Yeongam Lake (2). According to the SIMPER and ANOSIM results, the phytoplankton communities in Yeongsan and Yeongam lakes were similar, and they differed from those in Gumho Lake. In addition, the phytoplankton community structure varied across the study sites in the three lakes, indicating that the water channels across the lakes have a minor influence on phytoplankton community distribution.

Improved Estimation of Hourly Surface Ozone Concentrations using Stacking Ensemble-based Spatial Interpolation (스태킹 앙상블 모델을 이용한 시간별 지상 오존 공간내삽 정확도 향상)

  • KIM, Ye-Jin;KANG, Eun-Jin;CHO, Dong-Jin;LEE, Si-Woo;IM, Jung-Ho
    • Journal of the Korean Association of Geographic Information Studies / v.25 no.3 / pp.74-99 / 2022
  • Surface ozone is produced by photochemical reactions of nitrogen oxides (NOx) and volatile organic compounds (VOCs) emitted from vehicles and industrial sites, and it adversely affects vegetation and the human body. In South Korea, ozone is monitored in real time at stations (i.e., point measurements), but it is difficult to monitor and analyze its continuous spatial distribution. In this study, surface ozone concentrations were interpolated to a spatial resolution of 1.5 km every hour using the stacking ensemble technique, followed by a 5-fold cross-validation. Base models for the stacking ensemble were cokriging, multi-linear regression (MLR), random forest (RF), and support vector regression (SVR), while MLR was used as the meta model, having all base model results as additional input variables. The results showed that the stacking ensemble model yielded better performance than the individual base models, with an averaged R of 0.76 and RMSE of 0.0065 ppm during the study period of 2020. The surface ozone concentration distribution generated by the stacking ensemble model had a wider range, with a spatial pattern similar to that of the terrain and urbanization variables, compared to those of the base models. The proposed model should not only be capable of producing the hourly spatial distribution of ozone, but also be highly applicable for calculating the daily maximum 8-hour ozone concentrations.
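The stacking scheme described in the abstract above (base models whose predictions feed an MLR meta model) can be sketched in a few dozen lines. This is a simplified toy under stated assumptions, not the authors' pipeline: the two base learners (a straight-line fit and a k-nearest-neighbour mean) stand in for cokriging, MLR, RF, and SVR; the meta model sees only the base predictions, not the original inputs; and the data are synthetic. All function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Base learner 1: ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x


def fit_knn(xs, ys, k=3):
    """Base learner 2: mean of the k nearest training targets."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_meta(features, ys):
    """Meta model: multiple linear regression with intercept, via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    w = solve(A, b)
    return lambda f: w[0] + sum(wi * fi for wi, fi in zip(w[1:], f))


def out_of_fold(fitters, xs, ys, n_folds=5):
    """Each sample is predicted by base models that never saw it during fitting."""
    n = len(xs)
    oof = [[0.0] * len(fitters) for _ in range(n)]
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        test = [i for i in range(n) if i % n_folds == fold]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        for m, fit in enumerate(fitters):
            model = fit(tx, ty)
            for i in test:
                oof[i][m] = model(xs[i])
    return oof


def rmse(ys, preds):
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))


# Synthetic, deterministic data for illustration only.
xs = [i / 10 for i in range(50)]
ys = [math.sin(x) + 0.1 * x for x in xs]

fitters = [fit_line, fit_knn]
oof = out_of_fold(fitters, xs, ys)           # 5-fold out-of-fold base predictions
meta = fit_meta(oof, ys)                     # MLR meta model on those predictions
stacked_rmse = rmse(ys, [meta(f) for f in oof])
base_rmse = [rmse(ys, [f[m] for f in oof]) for m in range(len(fitters))]
print(base_rmse, stacked_rmse)
```

Fitting the meta model on out-of-fold predictions keeps it from simply rewarding the base model that overfits most. The stacked error printed here is measured on the same samples the meta model was fitted to, so it is optimistic; an honest comparison would use a further held-out set or nested cross-validation.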