• Title/Summary/Keyword: Bias

Application of multiple linear regression and artificial neural network models to forecast long-term precipitation in the Geum River basin (다중회귀모형과 인공신경망모형을 이용한 금강권역 강수량 장기예측)

  • Kim, Chul-Gyum;Lee, Jeongwoo;Lee, Jeong Eun;Kim, Hyeonjun
    • Journal of Korea Water Resources Association / v.55 no.10 / pp.723-736 / 2022
  • In this study, monthly precipitation forecasting models that can predict up to 12 months in advance were constructed for the Geum River basin, and two statistical techniques, multiple linear regression (MLR) and artificial neural network (ANN), were applied to the model construction. As predictor candidates, a total of 47 climate indices were used, including 39 global climate patterns provided by the National Oceanic and Atmospheric Administration (NOAA) and 8 meteorological factors for the basin. Forecast models were constructed by using climate indices with high correlation by analyzing the teleconnection between the monthly precipitation and each climate index for the past 40 years based on the forecast month. In the goodness-of-fit test results for the average value of forecasts of each month for 1991 to 2021, the MLR models showed -3.3 to -0.1% for the percent bias (PBIAS), 0.45 to 0.50 for the Nash-Sutcliffe efficiency (NSE), and 0.69 to 0.70 for the Pearson correlation coefficient (r), whereas, the ANN models showed PBIAS -5.0~+0.5%, NSE 0.35~0.47, and r 0.64~0.70. The mean values predicted by the MLR models were found to be closer to the observation than the ANN models. The probability of including observations within the forecast range for each month was 57.5 to 83.6% (average 72.9%) for the MLR models, and 71.5 to 88.7% (average 81.1%) for the ANN models, indicating that the ANN models showed better results. The tercile probability by month was 25.9 to 41.9% (average 34.6%) for the MLR models, and 30.3 to 39.1% (average 34.7%) for the ANN models. Both models showed long-term predictability of monthly precipitation with an average of 33.3% or more in tercile probability. In conclusion, the difference in predictability between the two models was found to be relatively small. 
However, when judging from the hit rate for the prediction range or the tercile probability, the monthly deviation for predictability was found to be relatively small for the ANN models.
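
The three goodness-of-fit scores quoted in this abstract (PBIAS, NSE, and Pearson r) can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not state a sign convention for PBIAS, so the one below (negative means the forecast total is lower than the observed total) is an assumption, and the precipitation values are made up.

```python
import math


def pbias(obs, sim):
    """Percent bias. Assumed convention: negative = forecasts lower than observed overall."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    obs_variation = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variation


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observations and forecasts."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    spread_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim))
    return cov / (spread_o * spread_s)


# Hypothetical monthly precipitation (mm), for illustration only.
observed = [30.0, 45.0, 120.0, 280.0, 150.0, 60.0]
forecast = [35.0, 40.0, 110.0, 250.0, 160.0, 55.0]

print(pbias(observed, forecast), nse(observed, forecast), pearson_r(observed, forecast))
```

Note that some references define PBIAS with the opposite sign (observed minus simulated), so the convention should be checked before comparing values across studies.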

GMI Microwave Sea Surface Temperature Validation and Environmental Factors in the Seas around Korean Peninsula (한반도 주변해 GMI 마이크로파 해수면온도 검증과 환경적 요인)

  • Kim, Hee-Young;Park, Kyung-Ae;Kwak, Byeong-Dae;Joo, Hui-Tae;Lee, Joon-Soo
    • Journal of the Korean earth science society / v.43 no.5 / pp.604-617 / 2022
  • Sea surface temperature (SST) is a key variable that can be used to understand ocean-atmosphere phenomena and predict climate change. Satellite microwave remote sensing enables the measurement of SST despite the presence of clouds and precipitation in the sensor path. Therefore, considering the high utilization of microwave SST, it is necessary to continuously verify its accuracy and analyze its error characteristics. In this study, the validation of the Global Precipitation Measurement (GPM) Microwave Imager (GMI) SST around the Northwest Pacific and Korean Peninsula was conducted using surface drifter temperature data for approximately eight years from March 2014 to December 2021. The GMI SST showed a bias of 0.09 K and an average root mean square error of 0.97 K compared to the drifter SST, which was slightly higher than that observed in previous studies. In addition, the error characteristics of the GMI SST were related to environmental factors, such as latitude, distance from the coast, sea wind, and water vapor volume. Errors tended to increase in coastal areas within 300 km of land and in high-latitude areas. In addition, relatively high errors were found in the range of weak wind speeds (<6 m s⁻¹) during the day and strong wind speeds (>10 m s⁻¹) at night. Atmospheric water vapor contributed to high SST differences in very low ranges of <30 mm and in very high ranges of >60 mm. These errors are consistent with those observed in previous studies, in which GMI data were less accurate at low SST, and were estimated to be due to differences in land and ocean radiation, wind-induced changes in sea surface roughness, and microwave absorption by atmospheric water vapor. These results suggest that the characteristics of the GMI SST differences should be clarified for more extensive use of microwave satellite SST calculations in the seas around the Korean Peninsula, including a part of the Northwest Pacific.

Generation of Daily High-resolution Sea Surface Temperature for the Seas around the Korean Peninsula Using Multi-satellite Data and Artificial Intelligence (다종 위성자료와 인공지능 기법을 이용한 한반도 주변 해역의 고해상도 해수면온도 자료 생산)

  • Jung, Sihun;Choo, Minki;Im, Jungho;Cho, Dongjin
    • Korean Journal of Remote Sensing / v.38 no.5_2 / pp.707-723 / 2022
  • Although satellite-based sea surface temperature (SST) is advantageous for monitoring large areas, spatiotemporal data gaps frequently occur due to various environmental or mechanical causes. Thus, it is crucial to fill in the gaps to maximize its usability. In this study, daily SST composite fields with a resolution of 4 km were produced through a two-step machine learning approach using polar-orbiting and geostationary satellite SST data. The first step was SST reconstruction based on the Data-Interpolating Convolutional Auto-Encoder (DINCAE) using multi-satellite-derived SST data. The second step improved the reconstructed SST, targeting in situ measurements, based on a light gradient boosting machine (LGBM) to finally produce daily SST composite fields. The DINCAE model was validated using random masks for 50 days, whereas the LGBM model was evaluated using leave-one-year-out cross-validation (LOYOCV). The SST reconstruction accuracy was high, resulting in an R² of 0.98 and a root-mean-square error (RMSE) of 0.97℃. The accuracy increase from the second step was also high when compared to in situ measurements, resulting in an RMSE decrease of 0.21-0.29℃ and an MAE decrease of 0.17-0.24℃. The SST composite fields generated using all in situ data in this study were comparable with the existing data-assimilated SST composite fields. In addition, the LGBM model in the second step greatly reduced the overfitting that was reported as a limitation in the previous study that used random forest. The spatial distribution of the corrected SST was similar to those of existing high-resolution SST composite fields, revealing that spatial details of oceanic phenomena such as fronts, eddies, and SST gradients were well simulated. This research demonstrated the potential to produce high-resolution seamless SST composite fields using multi-satellite data and artificial intelligence.
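
The leave-one-year-out cross-validation (LOYOCV) mentioned in this abstract can be sketched as a fold generator. This is only an illustration of the splitting scheme under the assumption that each sample carries a calendar-year label; the function name and the sample years are hypothetical, and no model is trained here.

```python
def leave_one_year_out(years):
    """Yield (held_out_year, train_indices, test_indices), one fold per distinct year."""
    for held_out in sorted(set(years)):
        train_idx = [i for i, y in enumerate(years) if y != held_out]
        test_idx = [i for i, y in enumerate(years) if y == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical year label for each in situ sample.
sample_years = [2019, 2019, 2020, 2020, 2021]
folds = list(leave_one_year_out(sample_years))

for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Holding out whole years, rather than random samples, keeps measurements from the same season out of both the training and the test sets at once, which gives a stricter check on overfitting.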

A Checklist to Improve the Fairness in AI Financial Service: Focused on the AI-based Credit Scoring Service (인공지능 기반 금융서비스의 공정성 확보를 위한 체크리스트 제안: 인공지능 기반 개인신용평가를 중심으로)

  • Kim, HaYeong;Heo, JeongYun;Kwon, Hochang
    • Journal of Intelligence and Information Systems / v.28 no.3 / pp.259-278 / 2022
  • With the spread of Artificial Intelligence (AI), various AI-based services are expanding in the financial sector, such as service recommendation, automated customer response, fraud detection systems (FDS), and credit scoring services. At the same time, problems related to reliability and unexpected social controversy are also occurring due to the nature of data-based machine learning. Based on this background, this study aimed to contribute to improving trust in AI-based financial services by proposing a checklist to secure fairness in AI-based credit scoring services, which directly affect consumers' financial lives. Among the key elements of trustworthy AI, such as transparency, safety, accountability, and fairness, fairness was selected as the subject of the study so that everyone could enjoy the benefits of automated algorithms from the perspective of inclusive finance without social discrimination. Through literature research, we divided the entire fairness-related operation process into three areas: data, algorithms, and users. For each area, we constructed four detailed considerations for evaluation, resulting in 12 checklist items. The relative importance and priority of the categories were evaluated through the analytic hierarchy process (AHP). We surveyed three groups that together represent the full range of financial stakeholders: financial field workers, artificial intelligence field workers, and general users. The three groups were analyzed separately according to the importance each assigned, and from a practical perspective, specific checks such as feasibility verification for using learning data and non-financial information and monitoring new inflow data were identified. Moreover, financial consumers in general were found to place high importance on the accuracy of result analysis and bias checks. We expect this result could contribute to the design and operation of fair AI-based financial services.
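
How AHP turns pairwise judgments into relative importance can be sketched as below. This uses the row geometric-mean method, a common approximation to the principal-eigenvector weights; the abstract does not say which variant the authors used, and the comparison values for the three areas are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Priority weights from a reciprocal pairwise-comparison matrix (row geometric means)."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgments: entry [i][j] says how much more important area i is than area j.
areas = ["data", "algorithm", "user"]
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)

for name, w in zip(areas, weights):
    print(f"{name}: {w:.3f}")
```

For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the eigenvector weights; with real survey responses a consistency check would normally accompany the calculation.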

Development of Diameter Distribution Change and Site Index in a Stand of Robinia pseudoacacia, a Major Honey Plant (꿀샘식물 아까시나무의 지위지수 도출 및 직경분포 변화)

  • Kim, Sora;Song, Jungeun;Park, Chunhee;Min, Suhui;Hong, Sunghee;Yun, Junhyuk;Son, Yeongmo
    • Journal of Korean Society of Forest Science / v.111 no.2 / pp.311-318 / 2022
  • We conducted this study to derive the site index, which is a criterion for the planting of Robinia pseudoacacia, a honey plant, and to investigate the diameter distribution change by derived site index. We applied the Chapman-Richards equation model to estimate the site index of the Robinia pseudoacacia stand. The site index was distributed within the range of 16-22 when the base age was 30 years. The fitness index of the site index estimation model was low, but we judged that there was no problem in the application because the residual distribution of the equation had not shifted to one side. We used the Weibull diameter distribution function to determine the diameter distribution of the Robinia pseudoacacia stand by site index. We used the mean diameter and the dominant tree height as independent variables to present the diameter distribution, and our analysis procedure was to estimate and recover the parameters of the Weibull diameter distribution function. We used the mean diameter and the dominant tree height of the Robinia pseudoacacia stand to show distribution by diameter class, and the fitness index for dbh distribution estimation was about 80.5%. As a result of schematizing the diameter distribution by site indices as a 30-year-old, we found that the higher the site index, the more the curve of the diameter distribution moved to the right. This suggests that if the plantation were to be established in a high site index stand, considering the suitable trees on the site, the growth of Robinia pseudoacacia would become active, and not only the production of wood but also the production of honey would increase. We therefore anticipate that the site index classification table and curve of this Robinia pseudoacacia stand will become the standard for decision making in the plantation and management of this tree.
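
The two models named in this abstract can be sketched as follows. The Chapman-Richards height curve and the three-parameter Weibull distribution are standard forms, but the anamorphic guide-curve projection used for the site index, the function names, and every parameter value below are assumptions for illustration, not the fitted values from the study.

```python
import math


def chapman_richards_height(age, a, b, c):
    """Dominant height at a given age: H = a * (1 - exp(-b * age)) ** c."""
    return a * (1.0 - math.exp(-b * age)) ** c


def site_index(height, age, b, c, base_age=30):
    """Project an observed dominant height to the base age (assumed anamorphic form)."""
    ratio = (1.0 - math.exp(-b * base_age)) / (1.0 - math.exp(-b * age))
    return height * ratio ** c


def weibull_cdf(d, location, scale, shape):
    """Cumulative proportion of trees with diameter below d (three-parameter Weibull)."""
    if d <= location:
        return 0.0
    return 1.0 - math.exp(-(((d - location) / scale) ** shape))


def class_proportions(edges, location, scale, shape):
    """Proportion of trees falling in each diameter class defined by consecutive edges."""
    cdf = [weibull_cdf(e, location, scale, shape) for e in edges]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


# Hypothetical parameters and a stand measured at age 20 with dominant height 14 m.
b_hat, c_hat = 0.05, 1.2
print(site_index(14.0, 20, b_hat, c_hat))

# Hypothetical 4 cm diameter classes from 6 cm to 30 cm.
edges = [6, 10, 14, 18, 22, 26, 30]
shares = class_proportions(edges, location=6.0, scale=10.0, shape=2.5)
print([round(s, 3) for s in shares])
```

Increasing the scale parameter shifts the class proportions toward larger diameters, which corresponds to the rightward movement of the curve described for higher site indices.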

Estimation of Stem Taper Equations and Stem Volume Table for Phyllostachys pubescens Mazel in South Korea (맹종죽의 수간곡선식 및 수간재적표 추정)

  • Bae, Eun-Ji;Son, Yeong-Mo;Kang, Jin-Taek
    • Journal of Korean Society of Forest Science / v.111 no.4 / pp.622-629 / 2022
  • The aim of this study was to derive a stem taper equation for Phyllostachys pubescens, a type of bamboo in South Korea, and to develop a stem volume table. To derive the stem taper equation, three stem taper models (Max & Burkhart, Kozak, and Lee) were used. Since bamboo stalks are hollow, both the outer and inner diameters were calculated, and connecting them enabled estimating the stem curves. The results of the three equations for estimating the outer and inner diameters led to selection of the Kozak model for determining the optimal stem taper because it had the highest fitness index and lowest error and bias. We used the Kozak model to estimate the diameter of Phyllostachys pubescens by stem height and drew the stem curve. After checking the residuals of the stem taper equation, all residuals were distributed around "0", which supported the suitability of the equation. To calculate the stem volume of Phyllostachys pubescens, a solid of revolution was created by rotating the outer-diameter stem curve through 360°, and the volume was calculated by applying Smalian's method. The volume of Phyllostachys pubescens was calculated by deducting the volume calculated from the inner diameter from the volume calculated from the outer diameter. The volume of Phyllostachys pubescens was only 20~30% of the volume of Larix kaempferi, which is a common species. However, considering the current trees/ha of Phyllostachys pubescens and the amount of bamboo shoots generated every year, although the individual tree volume was predicted to be small, the volume/ha was not very different or perhaps larger. The significance of this study is that the stem taper equation and stem volume table for Phyllostachys pubescens were developed for the first time in South Korea. The results are expected to be used as basic data for bamboo trading, for which public and industrial demand is increasing, and for carbon absorption estimation.
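
The volume step described in this abstract (Smalian's method applied to the outer and inner stem curves, then subtracted) can be sketched as follows. The section heights and diameters are made up, and the function names are hypothetical; in the study the diameters would come from the fitted Kozak taper equation.

```python
import math


def cross_section_area(diameter_cm):
    """Circular cross-section area in square metres from a diameter in centimetres."""
    return math.pi * (diameter_cm / 100.0) ** 2 / 4.0


def smalian_volume(heights_m, diameters_cm):
    """Sum of section volumes, each length * (lower area + upper area) / 2."""
    volume = 0.0
    for i in range(len(heights_m) - 1):
        length = heights_m[i + 1] - heights_m[i]
        lower = cross_section_area(diameters_cm[i])
        upper = cross_section_area(diameters_cm[i + 1])
        volume += length * (lower + upper) / 2.0
    return volume


def hollow_stem_volume(heights_m, outer_cm, inner_cm):
    """Wall volume of a hollow culm: outer-diameter volume minus inner-diameter volume."""
    return smalian_volume(heights_m, outer_cm) - smalian_volume(heights_m, inner_cm)


# Hypothetical culm measured every 2 m (heights in m, diameters in cm).
heights = [0.0, 2.0, 4.0, 6.0, 8.0]
outer = [12.0, 11.0, 9.5, 7.0, 3.0]
inner = [9.0, 8.5, 7.5, 5.5, 2.0]

print(f"{hollow_stem_volume(heights, outer, inner):.4f} m^3")
```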

A preliminary assessment of high-spatial-resolution satellite rainfall estimation from SAR Sentinel-1 over the central region of South Korea (한반도 중부지역에서의 SAR Sentinel-1 위성강우량 추정에 관한 예비평가)

  • Nguyen, Hoang Hai;Jung, Woosung;Lee, Dalgeun;Shin, Daeyun
    • Journal of Korea Water Resources Association / v.55 no.6 / pp.393-404 / 2022
  • Reliable terrestrial rainfall observations from satellites at finer spatial resolution are essential for urban hydrological and microscale agricultural demands. Although various satellite rainfall products based on the traditional "top-down" approach have been widely used, they are limited in spatial resolution. This study aims to assess the potential of a novel "bottom-up" approach for rainfall estimation, the parameterized SM2RAIN model, applied to the C-band SAR Sentinel-1 satellite data (SM2RAIN-S1), to generate high-spatial-resolution terrestrial rainfall estimates (0.01° grid/6-day) over Central South Korea. Its performance was evaluated for both spatial and temporal variability using the respective rainfall data from a conventional reanalysis product and rain gauge network for a 1-year period over two different sub-regions in Central South Korea: the mixed forest-dominated middle sub-region and the cropland-dominated west coast sub-region. Evaluation results indicated that the SM2RAIN-S1 product can capture general rainfall patterns in Central South Korea and holds potential for high-spatial-resolution rainfall measurement at the local scale with different land covers, while providing less biased rainfall estimates against rain gauge observations. Moreover, the SM2RAIN-S1 rainfall product was better in mixed forests considering the Pearson's correlation coefficient (R = 0.69), implying the suitability of 6-day SM2RAIN-S1 data in capturing the temporal dynamics of soil moisture and rainfall in mixed forests. However, in terms of RMSE and Bias, better performance was obtained with the SM2RAIN-S1 rainfall product over croplands rather than mixed forests, indicating that larger errors induced by high evapotranspiration losses (especially in mixed forests) need to be included in further improvement of the SM2RAIN.

Changes in body composition, body balance, metabolic parameters and eating behavior among overweight and obese women due to adherence to the Pilates exercise program (과체중·비만인에서 필라테스 운동 순응도에 따른 식생활 변화, 체구성, 신체 균형도 및 대사지표 개선효과)

  • Hyun Ju Kim;Jihyun Park;Mi Ri Ha;Ye Jin Kim;Chaerin Kim;Oh Yoen Kim
    • Journal of Nutrition and Health / v.55 no.6 / pp.642-655 / 2022
  • Purpose: We examined the effects of the 8-week moderate-intensity Pilates exercise program on body composition, balance ability, metabolic parameters, arterial condition, and eating habits among overweight and obese women. Methods: From the general sample of overweight or obese Korean women (body mass index ≥ 23 kg/m²), those who had not been diagnosed with any chronic degenerative diseases were enrolled in the study (n = 39). After 8 weeks of the Pilates exercise program, the participants were subdivided into adherence and non-adherence groups. Among the study participants, 24 women were matched for age and menopausal status to reduce bias, and then finally included for the comparison (Pilates-adherence, n = 12; Pilates-non-adherence, n = 12). Results: The body balance measured by the Y-balance test, body mass index, and subcutaneous fat areas were significantly improved in both groups. However, the Pilates-adherence group showed more positive changes in body balance and had significant improvement in body composition and metabolic parameters such as waist size, visceral fat area, systolic blood pressure, arterial aging index, fasting blood glucose, and glycated hemoglobin than the Pilates-non-adherence group. In addition, the nutrition quotient scores for Korean adults (balance, moderation, and behavior, but not diversity) were significantly improved in both groups after dietary education. However, the participants did not show dramatic improvement in the metabolic parameters, because all the study subjects were in relatively good health and did not have any diagnosed diseases. Conclusion: This study demonstrated that higher adherence to the Pilates exercise program together with a modification of eating habits may effectively improve body balance, body composition, and obesity-related parameters among overweight and obese women.

A Study on the Potential Use of ChatGPT in Public Design Policy Decision-Making (공공디자인 정책 결정에 ChatGPT의 활용 가능성에 관한연구)

  • Son, Dong Joo;Yoon, Myeong Han
    • Journal of Service Research and Studies / v.13 no.3 / pp.172-189 / 2023
  • This study investigated the potential contribution of ChatGPT, a massive language and information model, in the decision-making process of public design policies, focusing on the characteristics inherent to public design. Public design utilizes the principles and approaches of design to address societal issues and aims to improve public services. In order to formulate public design policies and plans, it is essential to base them on extensive data, including the general status of the area, population demographics, infrastructure, resources, safety, existing policies, legal regulations, landscape, spatial conditions, current state of public design, and regional issues. Therefore, public design is a field of design research that encompasses a vast amount of data and language. Considering the rapid advancements in artificial intelligence technology and the significance of public design, this study aims to explore how massive language and information models like ChatGPT can contribute to public design policies. Alongside, we reviewed the concepts and principles of public design, its role in policy development and implementation, and examined the overview and features of ChatGPT, including its application cases and preceding research to determine its utility in the decision-making process of public design policies. The study found that ChatGPT could offer substantial language information during the formulation of public design policies and assist in decision-making. In particular, ChatGPT proved useful in providing various perspectives and swiftly supplying information necessary for policy decisions. Additionally, the trend of utilizing artificial intelligence in government policy development was confirmed through various studies. However, the usage of ChatGPT also unveiled ethical, legal, and personal privacy issues. Notably, ethical dilemmas were raised, along with issues related to bias and fairness. 
To practically apply ChatGPT in the decision-making process of public design policies, first, it is necessary to enhance the capacities of policy developers and public design experts to a certain extent. Second, it is advisable to create a provisional regulation named 'Ordinance on the Use of AI in Policy' to continuously refine the utilization until legal adjustments are made. Currently, implementing these two strategies is deemed necessary. Consequently, employing massive language and information models like ChatGPT in the public design field, which harbors a vast amount of language, holds substantial value.

Evaluation of bias and uncertainty in snow depth reanalysis data over South Korea (한반도 적설심 재분석자료의 오차 및 불확실성 평가)

  • Jeon, Hyunho;Lee, Seulchan;Lee, Yangwon;Kim, Jinsoo;Choi, Minha
    • Journal of Korea Water Resources Association / v.56 no.9 / pp.543-551 / 2023
  • Snow is an essential climate factor that affects the climate system and surface energy balance, and it also has a crucial role in water balance by providing solid water stored during the winter for spring runoff and groundwater recharge. In this study, statistical analysis of snow depth data from the Local Data Assimilation and Prediction System (LDAPS), the Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2), and ERA5-Land was used to evaluate their applicability in South Korea. The statistical analysis between the Automated Synoptic Observing System (ASOS) ground observation data provided by the Korea Meteorological Administration (KMA) and the reanalysis data showed that LDAPS and ERA5-Land were highly correlated with a correlation coefficient of more than 0.69, but LDAPS showed a large error with an RMSE of 0.79 m. In the case of MERRA-2, the correlation coefficient was lower at 0.17 because a constant value was estimated continuously for some periods, which did not adequately simulate the increasing and decreasing trends in the data. The statistical analysis of LDAPS and ASOS showed high performance near Gangwon Province, where the average snowfall is relatively high, and low performance in the southern region, where the average snowfall is low. Finally, the error variance between the four independent snow depth data used in this study was calculated through triple collocation (TC), and a merged snow depth dataset was produced through weighting factors. Among the reanalysis data, the error variance was highest for LDAPS, followed by MERRA-2 and ERA5-Land, and LDAPS was given a lower weighting factor due to its higher error variance. In addition, the spatial distribution of ERA5-Land snow depth data showed less variability, so the TC-merged snow depth data showed a similar spatial distribution to MERRA-2, which has a low spatial resolution.
Considering the correlation, error, and uncertainty of the data, the ERA5-Land data is suitable for snow-related analysis in South Korea. In addition, it is expected that LDAPS data, which is highly correlated with other data but tends to be overestimated, can be actively utilized for high-resolution representation of regional and climatic diversity if appropriate corrections are performed.
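
The triple collocation (TC) step and the weighting described in this abstract can be sketched as below. This is the classical covariance-based estimator for one triplet of datasets, under the usual assumption that the three error series are uncorrelated with each other and with the true signal; the abstract mentions four datasets and does not give the authors' exact formulation, so the triplet setup, the inverse-variance weighting, and the function names are assumptions.

```python
def covariance(a, b):
    """Sample covariance of two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def tc_error_variances(x, y, z):
    """Error variance of each series from the covariances of a collocated triplet."""
    cxy, cxz, cyz = covariance(x, y), covariance(x, z), covariance(y, z)
    return (
        covariance(x, x) - cxy * cxz / cyz,
        covariance(y, y) - cxy * cyz / cxz,
        covariance(z, z) - cxz * cyz / cxy,
    )


def inverse_variance_weights(error_variances):
    """Weights proportional to 1 / error variance, normalised to sum to 1."""
    inverse = [1.0 / v for v in error_variances]
    total = sum(inverse)
    return [w / total for w in inverse]


def merge(series_list, weights):
    """Weighted average of the collocated series, value by value."""
    return [sum(w * s[i] for w, s in zip(weights, series_list))
            for i in range(len(series_list[0]))]


# Hypothetical weights for three datasets with error variances 1, 2, and 4.
print(inverse_variance_weights([1.0, 2.0, 4.0]))
```

A dataset with a larger estimated error variance receives a smaller weight, which matches the lower weighting factor reported for LDAPS.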