• Title/Summary/Keyword: root mean square error

Comparative Assessment of Linear Regression and Machine Learning for Analyzing the Spatial Distribution of Ground-level NO2 Concentrations: A Case Study for Seoul, Korea (서울 지역 지상 NO2 농도 공간 분포 분석을 위한 회귀 모델 및 기계학습 기법 비교)

  • Kang, Eunjin;Yoo, Cheolhee;Shin, Yeji;Cho, Dongjin;Im, Jungho
    • Korean Journal of Remote Sensing / v.37 no.6_1 / pp.1739-1756 / 2021
  • Atmospheric nitrogen dioxide (NO2) is mainly caused by anthropogenic emissions. It contributes to the formation of secondary pollutants and ozone through chemical reactions, and adversely affects human health. Although ground stations monitor NO2 concentrations in real time in Korea, they make it difficult to analyze the spatial distribution of NO2 concentrations, especially over areas with no stations. Therefore, this study conducted a comparative experiment on spatial interpolation of NO2 concentrations for the year 2020, based on two linear-regression methods (i.e., multiple linear regression (MLR) and regression kriging (RK)) and two machine learning approaches (i.e., random forest (RF) and support vector regression (SVR)). The four approaches were compared using leave-one-out cross-validation (LOOCV). The daily LOOCV results showed that MLR, RK, and SVR each produced an average daily index of agreement (IOA) of 0.57, which was higher than that of RF (0.50). The average daily normalized root mean square error of RK was 0.9483%, slightly lower than those of the other models. MLR, RK, and SVR showed similar seasonal distribution patterns, and the dynamic range of the resulting NO2 concentrations from these three models was similar, while that from RF was relatively small. The multivariate linear regression approaches are expected to be promising methods for spatial interpolation of ground-level NO2 concentrations and other parameters in urban areas.
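The evaluation scheme named in this abstract (LOOCV scored with normalized RMSE and IOA) can be illustrated with a minimal sketch. Several things here are assumptions and not taken from the paper: a one-predictor least-squares line stands in for the study's models, the RMSE is normalized by the mean of the observations (the abstract does not state the normalization), IOA is taken to be Willmott's index of agreement, and the station values are made up for illustration.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nrmse_percent(obs, pred):
    """RMSE as a percentage of the observed mean (assumed normalization)."""
    return 100.0 * rmse(obs, pred) / (sum(obs) / len(obs))


def index_of_agreement(obs, pred):
    """Willmott's index of agreement: 1 is a perfect match."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(obs, pred))
    return 1.0 - num / den


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def loocv_predictions(x, y):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(slope * x[i] + intercept)
    return preds


# Hypothetical predictor and NO2 values at six stations (illustrative only).
predictor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
no2 = [12.1, 13.9, 16.2, 17.8, 20.1, 22.0]

held_out = loocv_predictions(predictor, no2)
print("nRMSE (%):", round(nrmse_percent(no2, held_out), 3))
print("IOA:", round(index_of_agreement(no2, held_out), 3))
```

Each station is predicted only by a model that never saw it, which is what makes LOOCV a fair proxy for interpolating to locations without stations.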

Estimation of TROPOMI-derived Ground-level SO2 Concentrations Using Machine Learning Over East Asia (기계학습을 활용한 동아시아 지역의 TROPOMI 기반 SO2 지상농도 추정)

  • Choi, Hyunyoung;Kang, Yoojin;Im, Jungho
    • Korean Journal of Remote Sensing / v.37 no.2 / pp.275-290 / 2021
  • Sulfur dioxide (SO2) in the atmosphere is mainly generated from anthropogenic emission sources. It forms ultra-fine particulate matter through chemical reactions and has harmful effects on both the environment and human health. In particular, ground-level SO2 concentrations are closely related to human activities. Satellite observations such as TROPOMI (TROPOspheric Monitoring Instrument)-derived column density data can provide spatially continuous monitoring of ground-level SO2 concentrations. This study proposes a 2-step residual corrected model to estimate ground-level SO2 concentrations through the synergistic use of satellite data and numerical model output. Random forest machine learning was adopted in the 2-step residual corrected model. The proposed model was evaluated through three cross-validations (i.e., random, spatial, and temporal). The results showed that the model produced slopes of 1.14-1.25, R values of 0.55-0.65, and relative root-mean-square error (rRMSE) of 58-63%, which were improved by 10% for slopes and 3% for R and rRMSE when compared to the model without residual correction. The model performance by country was slightly reduced in Japan, often resulting in overestimation, where the sample size was small and the concentration level was relatively low. The spatial and temporal distributions of SO2 produced by the model agreed with those of the in-situ measurements, especially over the Yangtze River Delta in China and the Seoul Metropolitan Area in South Korea, which are highly dependent on the characteristics of anthropogenic emission sources. The model proposed in this study can be used for long-term monitoring of ground-level SO2 concentrations in both the spatial and temporal domains.

Estimation of Surface fCO2 in the Southwest East Sea using Machine Learning Techniques (기계학습법을 이용한 동해 남서부해역의 표층 이산화탄소분압(fCO2) 추정)

  • HAHM, DOSHIK;PARK, SOYEONA;CHOI, SANG-HWA;KANG, DONG-JIN;RHO, TAEKEUN;LEE, TONGSUP
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.24 no.3 / pp.375-388 / 2019
  • Accurate evaluation of sea-to-air $CO_2$ flux and its variability is crucial to understanding the global carbon cycle and predicting atmospheric $CO_2$ concentration. $fCO_2$ observations are sparse in space and time in the East Sea. In this study, we derived high-resolution time series of surface $fCO_2$ values in the southwest East Sea by feeding sea surface temperature (SST), salinity (SSS), chlorophyll-a (CHL), and mixed layer depth (MLD) values, from either satellite observations or numerical model outputs, to three machine learning models. The root mean square error of the best performing model, a Random Forest (RF) model, was $7.1\,\mu\mathrm{atm}$. Important parameters in predicting $fCO_2$ in the RF model were SST and SSS along with time information; CHL and MLD were much less important than the other parameters. The net $CO_2$ flux in the southwest East Sea, calculated from the $fCO_2$ predicted by the RF model, was $-0.76 \pm 1.15\ \mathrm{mol\,m^{-2}\,yr^{-1}}$, close to the lower bound of the previous estimates, which range from $-0.66$ to $-2.47\ \mathrm{mol\,m^{-2}\,yr^{-1}}$. The time series of $fCO_2$ predicted by the RF model showed significant variation even over an interval as short as a week. For accurate evaluation of the $CO_2$ flux in the Ulleung Basin, it is necessary to conduct high-resolution in situ observations in spring, when $fCO_2$ changes rapidly.

Study on the Concentration Estimation Equation of Nitrogen Dioxide using Hyperspectral Sensor (초분광센서를 활용한 이산화질소 농도 추정식에 관한 연구)

  • Jeon, Eui-Ik;Park, Jin-Woo;Lim, Seong-Ha;Kim, Dong-Woo;Yu, Jae-Jin;Son, Seung-Woo;Jeon, Hyung-Jin;Yoon, Jeong-Ho
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.6 / pp.19-25 / 2019
  • The CleanSYS (Clean SYStem) is operated to monitor air pollutants emitted from specific industrial complexes in Korea. Industrial complexes without the system are therefore monitored directly by control officers. For efficient monitoring, studies using various sensors have been conducted to monitor air pollutants emitted from industrial complexes. In this study, hyperspectral sensors were used to model and verify equations for estimating the concentration of $NO_2$ (nitrogen dioxide) in emitted air pollutants. To develop the equations, spectral radiance was observed for $NO_2$ at various concentrations with different SZA (Solar Zenith Angle), VZA (Viewing Zenith Angle), and RAA (Relative Azimuth Angle). From the observed spectral radiance, the difference between the values at specific wavelengths was taken as an absorption depth, and the equations were developed using the relationship between this depth and the $NO_2$ concentration. The spectral radiance of a mixed gas of $NO_2$ and $SO_2$ (sulfur dioxide) was used to verify the equations. As a result, the $R^2$ (coefficient of determination) ranged from 0.71 to 0.88 and the RMSE (Root Mean Square Error) from 72 to 23 ppm depending on the form of the equation, and the $R^2$ of the exponential form was the highest among the equations. Depending on the type of equation, the accuracy of the estimated concentration is not constant as the concentration varies. However, if the equations are advanced in the future, hyperspectral sensors can be used to monitor the $NO_2$ emitted from industrial complexes.

Predicting Crime Risky Area Using Machine Learning (머신러닝기반 범죄발생 위험지역 예측)

  • HEO, Sun-Young;KIM, Ju-Young;MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies / v.21 no.4 / pp.64-80 / 2018
  • In Korea, citizens can only access general information about crime, so it is difficult for them to know how much they are exposed to it. If the police could predict crime-risk areas, they could cope with crime efficiently even with insufficient police and enforcement resources. However, there is no prediction system in Korea, and related research is scarce. Against this background, the final goal of this study is to develop an automated crime prediction system. As a first step, we built a big data set consisting of real local crime information and urban physical and non-physical data. Then, we developed a crime prediction model using machine learning. Finally, we assumed several possible scenarios, calculated the probability of crime, and visualized the results on a map to improve public understanding. Among the factors affecting crime occurrence identified in previous and case studies, the following data were processed into big data for machine learning: real crime information, weather information (temperature, rainfall, wind speed, humidity, sunshine, insolation, snowfall, cloud cover), and local information (average building coverage, average floor area ratio, average building height, number of buildings, average appraised land value, average area of residential buildings, average number of ground floors). Among the supervised machine learning algorithms, the decision tree model, the random forest model, and the SVM model, which are known to be powerful and accurate in various fields, were used to construct the crime prevention model. As a result, the decision tree model, which had the lowest RMSE, was selected as the optimal prediction model. Based on this model, several scenarios were set for theft and violence cases, which are the most frequent in the case city J, and the probability of crime was estimated on a $250 \times 250$ m grid. We found that high crime-risk areas occur in three patterns in case city J. The probability of crime was divided into three classes and visualized on a map on a $250 \times 250$ m grid. Finally, we developed a crime prediction model using a machine learning algorithm and visualized the crime-risk areas on a map, in a way that allows the model to be recalculated and the results visualized simultaneously as time and urban conditions change.

Mediation analysis of dietary habits, nutrient intakes, daily life in the relationship between working hours of Korean shift workers and metabolic syndrome : the sixth (2013 ~ 2015) Korea National Health and Nutrition Examination Survey (교대근무자의 근무시간과 대사증후군의 관계에서 식습관, 영양섭취상태, 일상생활의 매개효과 분석 : 6기 국민건강영양조사 (2013 ~ 2015) 데이터 이용)

  • Kim, Yoona;Kim, Hyeon Hee;Lim, Dong Hoon
    • Journal of Nutrition and Health / v.51 no.6 / pp.567-579 / 2018
  • Purpose: This study examined the mediation effects of dietary habits, nutrient intake, and daily life in the relationship between the working hours of Korean shift workers and metabolic syndrome. Methods: Data were collected from the sixth (2013-2015) Korea National Health and Nutrition Examination Survey (KNHANES). Stochastic regression imputation was used to fill in missing data. Statistical analysis was performed on Korean shift workers with metabolic syndrome using the SPSS 24 program for Windows and a structural equation model (SEM) using the analysis of moment structure (AMOS) 21.0 package. Results: The model fitted the data well in terms of the goodness of fit index (GFI) = 0.939, root mean square error of approximation (RMSEA) = 0.025, normed fit index (NFI) = 0.917, Tucker-Lewis index (TLI) = 0.984, comparative fit index (CFI) = 0.987, and adjusted goodness of fit index (AGFI) = 0.915. The specific mediation effect of dietary habits (p = 0.023) was statistically significant in the impact of the working hours of shift workers on nutrient intake, and the specific mediation effect of daily life (p = 0.019) was statistically significant in the impact of the working hours of shift workers on metabolic syndrome. On the other hand, dietary habits, nutrient intake, and daily life had no significant multiple mediator effects on the working hours of shift workers with metabolic syndrome. Conclusion: The appropriate model suggests that working hours have a direct effect on daily life, which has a mediation effect on the risk of metabolic syndrome in shift workers.

Estimation of Soil Moisture Using Sentinel-1 SAR Images and Multiple Linear Regression Model Considering Antecedent Precipitations (선행 강우를 고려한 Sentinel-1 SAR 위성영상과 다중선형회귀모형을 활용한 토양수분 산정)

  • Chung, Jeehun;Son, Moobeen;Lee, Yonggwan;Kim, Seongjoon
    • Korean Journal of Remote Sensing / v.37 no.3 / pp.515-530 / 2021
  • This study estimates soil moisture (SM) using Sentinel-1A/B C-band SAR (synthetic aperture radar) images and a Multiple Linear Regression Model (MLRM) in the Yongdam-Dam watershed of South Korea. Both Sentinel-1A and -1B images (6-day interval and 10 m resolution) were collected for 5 years from 2015 to 2019. Geometric, radiometric, and noise corrections were performed using the SNAP (SentiNel Application Platform) software, and the images were converted to backscattering coefficients (BSC) of VV and VH polarization. In-situ SM data measured at 6 locations using TDR were used to validate the estimated SM results. The 5-day antecedent precipitation data were also collected to overcome the difficulty of estimation over vegetated areas, where the signal does not reach the ground. The MLRM modeling was performed using yearly and seasonal data sets, and correlation analysis was performed according to the number of independent variables. The estimated SM was verified against observed SM using the coefficient of determination (R2) and the root mean square error (RMSE). For SM modeling using only BSC in the grass area, R2 was 0.13 and RMSE was 4.83%. When 5 days of antecedent precipitation data were used, R2 was 0.37 and RMSE was 4.11%. With the use of dry days and seasonal regression equations to reflect the decrease pattern and seasonal variability of SM, the correlation increased significantly, with R2 of 0.69 and RMSE of 2.88%.

Evaluation of stream flow and water quality changes of Yeongsan river basin by inter-basin water transfer using SWAT (SWAT을 이용한 유역간 물이동량에 따른 영산강유역의 하천 유량 및 수질 변동 분석)

  • Kim, Yong Won;Lee, Ji Wan;Woo, So Young;Kim, Seong Joon
    • Journal of Korea Water Resources Association / v.53 no.12 / pp.1081-1095 / 2020
  • This study evaluates stream flow and water quality changes in the Yeongsan river basin (3,371.4 km²) caused by inter-basin water transfer (IBWT) from the Juam dam of the Seomjin river basin using SWAT (Soil and Water Assessment Tool). The SWAT model was set up using the inlet function for IBWT between the donor and receiving basins. It was calibrated and validated with 14 years (2005 ~ 2018) of data from 1 stream (MR) and 2 multi-functional weir (SCW, JSW) water level gauging stations and 3 water quality stations (GJ2, NJ, and HP), including data on IBWT and effluent from wastewater treatment plants in the Yeongsan river basin. For streamflow and weir inflows (MR, SCW, and JSW), the coefficient of determination (R2), Nash-Sutcliffe efficiency (NSE), root mean square error (RMSE), and percent bias (PBIAS) were 0.69 ~ 0.81, 0.61 ~ 0.70, 1.34 ~ 2.60 mm/day, and -8.3% ~ +7.6%, respectively. For water quality, the R2 values of SS, T-N, and T-P were 0.69 ~ 0.81, 0.61 ~ 0.70, and 0.54 ~ 0.63, respectively. The average streamflow of the Yeongsan river basin was 12.0 m³/sec, and the average SS, T-N, and T-P were 110.5 mg/L, 4.4 mg/L, and 0.18 mg/L, respectively. Under the 130% IBWT scenario, the streamflow and SS increased to 12.94 m³/sec (+7.8%) and 111.26 mg/L (+0.7%), and the T-N and T-P decreased to 4.17 mg/L (-5.2%) and 0.165 mg/L (-8.3%), respectively. Under the 70% IBWT scenario, the streamflow and SS decreased to 11.07 m³/sec (-7.8%) and 109.74 mg/L (-0.7%), and the T-N and T-P increased to 4.68 mg/L (+6.4%) and 0.199 mg/L (+10.6%), respectively.
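The goodness-of-fit measures listed in this abstract (R2, NSE, RMSE, PBIAS) have standard definitions, sketched below in plain Python. Two points are assumptions, since the abstract does not give formulas: R2 is computed as the squared Pearson correlation, and PBIAS uses the sign convention common in hydrological model evaluation, where a positive value means the simulation underestimates the observations. The flow values are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, sim):
    """Root mean square error, in the units of the data (e.g. mm/day)."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def pbias(obs, sim):
    """Percent bias; positive means underestimation (assumed convention)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    mean_obs, mean_sim = _mean(obs), _mean(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


# Hypothetical observed and simulated daily flows in mm/day.
observed = [1.2, 3.4, 2.8, 5.1, 4.0, 2.2]
simulated = [1.0, 3.9, 2.5, 4.6, 4.4, 2.0]

print("R2:", round(r_squared(observed, simulated), 3))
print("NSE:", round(nse(observed, simulated), 3))
print("RMSE:", round(rmse(observed, simulated), 3))
print("PBIAS (%):", round(pbias(observed, simulated), 2))
```

NSE and R2 can disagree noticeably: R2 ignores systematic offset and scaling, while NSE penalizes both, which is one reason calibration reports usually list them together with PBIAS.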

Estimation of the Lodging Area in Rice Using Deep Learning (딥러닝을 이용한 벼 도복 면적 추정)

  • Ban, Ho-Young;Baek, Jae-Kyeong;Sang, Wan-Gyu;Kim, Jun-Hwan;Seo, Myung-Chul
    • KOREAN JOURNAL OF CROP SCIENCE / v.66 no.2 / pp.105-111 / 2021
  • Rice lodging is an annual occurrence caused by typhoons accompanied by strong winds and heavy rainfall, resulting in damage related to pre-harvest sprouting during the ripening period. Thus, rapid estimation of the area of lodged rice is necessary to enable timely responses to damage. To this end, we obtained images related to rice lodging using a drone in Gimje, Buan, and Gunsan, which were converted to 128 × 128 pixel images. A convolutional neural network (CNN) model, a deep learning model, was trained on these images to predict rice lodging, which was classified into two types (lodging and non-lodging), and the images were divided in an 8:2 ratio into a training set and a validation set. The CNN model was layered and trained using three optimizers (Adam, RMSprop, and SGD). The area of rice lodging was evaluated for the three fields using the obtained data, excluding the training and validation sets. The images were combined into composite images of the entire fields using Metashape, and these images were divided into 128 × 128 pixel tiles. Lodging in the divided images was predicted using the trained CNN model, and the extent of lodging was calculated by multiplying the ratio of the number of lodging images to the total number of field images by the area of the entire field. The results for the training and validation sets showed that accuracy increased as learning progressed and eventually exceeded 0.919. The results obtained for each of the three fields showed high accuracy for all optimizers, among which Adam showed the highest accuracy (normalized root mean square error: 2.73%). On the basis of these findings, it is anticipated that the area of lodged rice can be rapidly predicted using deep learning.
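The area calculation described in this abstract reduces to a tile count ratio scaled by the field area. A minimal sketch follows; the function name, the label strings, and the example numbers are hypothetical, and the labels would in practice come from the trained classifier, which is not reproduced here.

```python
def lodged_area(tile_labels, field_area_m2):
    """Estimate lodged area from per-tile predictions.

    tile_labels: predicted class for every tile covering the field,
                 either "lodging" or "non-lodging" (hypothetical labels).
    field_area_m2: area of the entire field in square metres.
    """
    if not tile_labels:
        raise ValueError("at least one tile is required")
    lodged_tiles = sum(1 for label in tile_labels if label == "lodging")
    # Share of lodged tiles, scaled to the area of the whole field.
    return field_area_m2 * lodged_tiles / len(tile_labels)


# Hypothetical field: 8 tiles, 3 predicted as lodged, 4,000 m2 in total.
labels = ["lodging", "non-lodging", "lodging", "non-lodging",
          "non-lodging", "lodging", "non-lodging", "non-lodging"]
print(lodged_area(labels, 4000.0))
```

This treats every tile as covering an equal share of the field, so tiles that straddle the field boundary would bias the estimate unless they are masked or weighted.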

Physical Offset of UAVs Calibration Method for Multi-sensor Fusion (다중 센서 융합을 위한 무인항공기 물리 오프셋 검보정 방법)

  • Kim, Cheolwook;Lim, Pyeong-chae;Chi, Junhwa;Kim, Taejung;Rhee, Sooahm
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1125-1139 / 2022
  • In an unmanned aerial vehicle (UAV) system, a physical offset can exist between the global positioning system/inertial measurement unit (GPS/IMU) sensor and an observation sensor such as a hyperspectral sensor or a lidar sensor. Because of this physical offset, misalignment between images can occur along the flight direction. In a multi-sensor system in particular, the observation sensor has to be swapped regularly for another one, and acquiring the calibration parameters each time is costly. In this study, we establish a precise sensor model equation that applies to multiple sensors in common and propose an independent physical offset estimation method. The proposed method consists of three steps. First, we define an appropriate rotation matrix for our system and an initial sensor model equation for direct georeferencing. Next, an observation equation for the physical offset estimation is established by extracting corresponding points between ground control points and the data observed by a sensor. Finally, the physical offset is estimated from the observed data, and the precise sensor model equation is established by applying the estimated parameters to the initial sensor model equation. Datasets from four regions (Jeon-ju, Incheon, Alaska, Norway) with different latitudes and longitudes were compared to analyze the effects of the calibration parameters. We confirmed that the misalignment between images was corrected after applying the physical offset in the sensor model equation. Absolute position accuracy was analyzed for the Incheon dataset against ground control points. For the hyperspectral image, the root mean square error (RMSE) in the X and Y directions was 0.12 m, and for the point cloud, the RMSE was 0.03 m. Furthermore, the relative position accuracy for specific points between the adjusted point cloud and the hyperspectral images was 0.07 m. This confirms that precise data mapping is possible for observations without ground control points using the proposed estimation method, and that multi-sensor fusion is feasible. We expect that a flexible multi-sensor platform can be operated economically through the independent parameter estimation method.
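Position accuracy against ground control points is typically summarized per axis and as a combined horizontal figure. The sketch below shows one common way to compute both; whether the 0.12 m reported above is a per-axis or combined value is not stated in the abstract, so this is an assumption, and the coordinates are invented for illustration.

```python
import math


def axis_rmse(errors):
    """RMSE of a list of signed errors along one axis."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def position_rmse(estimated, reference):
    """Return (rmse_x, rmse_y, horizontal_rmse) for paired (x, y) points."""
    dx = [e[0] - r[0] for e, r in zip(estimated, reference)]
    dy = [e[1] - r[1] for e, r in zip(estimated, reference)]
    rmse_x, rmse_y = axis_rmse(dx), axis_rmse(dy)
    # Combined horizontal error: sqrt(mean(dx^2 + dy^2)).
    horizontal = math.sqrt(rmse_x ** 2 + rmse_y ** 2)
    return rmse_x, rmse_y, horizontal


# Hypothetical georeferenced points and surveyed control points, in metres.
mapped = [(100.08, 200.05), (150.02, 249.91), (199.95, 300.10)]
control = [(100.00, 200.00), (150.00, 250.00), (200.00, 300.00)]

rx, ry, rh = position_rmse(mapped, control)
print("RMSE X:", round(rx, 3), "RMSE Y:", round(ry, 3), "horizontal:", round(rh, 3))
```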