• Title/Summary/Keyword: Performance Bias

Machine learning-based corporate default risk prediction model verification and policy recommendation: Focusing on improvement through stacking ensemble model (머신러닝 기반 기업부도위험 예측모델 검증 및 정책적 제언: 스태킹 앙상블 모델을 통한 개선을 중심으로)

  • Eom, Haneul;Kim, Jaeseong;Choi, Sangok
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.105-129 / 2020
  • This study uses corporate data from 2012 to 2018, when K-IFRS was applied in earnest, to predict default risk. The data used in the analysis totaled 10,545 rows and 160 columns, including 38 from the statement of financial position, 26 from the statement of comprehensive income, 11 from the statement of cash flows, and 76 financial ratio indices. Unlike most prior studies, which used the default event as the basis for learning default risk, this study calculated default risk from each company's market capitalization and stock price volatility based on the Merton model. This resolved the data imbalance caused by the scarcity of default events, which had been pointed out as a limitation of the existing methodology, and made it possible to reflect the differences in default risk that exist among ordinary companies. Because learning was conducted using only corporate information that is also available for unlisted companies, the default risk of unlisted companies without stock price information can be appropriately derived. The approach can therefore provide stable default risk assessment services to unlisted companies, such as small and medium-sized companies and startups, whose default risk is difficult to determine with traditional credit rating models. Although machine learning has recently been actively studied for predicting corporate default risk, model bias issues exist because most studies make predictions based on a single model. A stable and reliable valuation methodology is required for calculating default risk, given that an entity's default risk information is very widely used in the market and sensitivity to differences in default risk is high. Strict standards are also required for the calculation methods.
The credit rating method stipulated by the Financial Services Commission in the Financial Investment Regulations calls for the preparation of evaluation methods, including verification of their adequacy, in consideration of past statistical data and experience with credit ratings and of changes in future market conditions. This study reduced the bias of individual models by using stacking ensemble techniques that synthesize various machine learning models. This makes it possible to capture complex nonlinear relationships between default risk and various corporate information while retaining the advantage of machine learning-based default risk prediction models, which take less time to calculate. To produce the sub-model forecasts used as input data for the Stacking Ensemble model, the training data were divided into seven pieces, and the sub-models were trained on the divided sets to produce forecasts. To compare the predictive power of the Stacking Ensemble model, Random Forest, MLP, and CNN models were trained on the full training data, and the predictive power of each model was then verified on the test set. The analysis showed that the Stacking Ensemble model exceeded the predictive power of the Random Forest model, which had the best performance among the single models. Next, to check for statistically significant differences between the forecasts of the Stacking Ensemble model and those of each individual model, pairs were constructed between the Stacking Ensemble model and each individual model. Because the Shapiro-Wilk normality test showed that none of the pairs followed a normal distribution, the nonparametric Wilcoxon rank-sum test was used to check whether the two model forecasts that make up each pair differed significantly. The analysis showed that the forecasts of the Stacking Ensemble model differed significantly from those of the MLP and CNN models.
In addition, this study can provide a methodology that allows existing credit rating agencies to apply machine learning-based bankruptcy risk prediction, given that traditional credit rating models can also be included as sub-models to calculate the final default probability. The Stacking Ensemble techniques proposed in this study can also help design models that meet the requirements of the Financial Investment Business Regulations through the combination of various sub-models. We hope that this research will be used as a resource to increase practical use by overcoming and improving the limitations of existing machine learning-based models.
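
The seven-way split described in the abstract is commonly implemented as out-of-fold prediction: each row's sub-model forecast comes from a model that was trained without that row, and those forecasts become the input features of the ensemble. The following is a minimal sketch of that idea, not the authors' implementation. The helper names (`kfold_indices`, `out_of_fold_predictions`) and the two toy sub-models are hypothetical stand-ins for the Random Forest, MLP, and CNN models named in the abstract.

```python
from statistics import mean


def kfold_indices(n_rows, n_folds=7):
    """Split row indices 0..n_rows-1 into n_folds contiguous, near-equal folds."""
    base, extra = divmod(n_rows, n_folds)
    folds, start = [], 0
    for i in range(n_folds):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def out_of_fold_predictions(X, y, fit, n_folds=7):
    """Return one prediction per row, each made by a model that never saw that row."""
    preds = [None] * len(X)
    for held_out in kfold_indices(len(X), n_folds):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            preds[i] = model(X[i])
    return preds


# Hypothetical toy sub-models; each fit function returns a prediction function.
def fit_mean(X, y):
    m = mean(y)
    return lambda x: m


def fit_slope(X, y):
    # Least-squares line through the origin on the first feature.
    num = sum(x[0] * t for x, t in zip(X, y))
    den = sum(x[0] ** 2 for x in X) or 1.0
    return lambda x: num / den * x[0]


# Toy data (assumption): 14 rows, target is exactly twice the single feature.
X = [[float(i)] for i in range(1, 15)]
y = [2.0 * x[0] for x in X]

# Each row of meta_features holds the sub-model forecasts for one training row;
# a meta-learner would be trained on these rows against y.
meta_features = list(zip(out_of_fold_predictions(X, y, fit_mean),
                         out_of_fold_predictions(X, y, fit_slope)))
```

Using out-of-fold forecasts rather than in-sample forecasts keeps the meta-learner from simply rewarding whichever sub-model overfits the training data most.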

Software Reliability Growth Modeling in the Testing Phase with an Outlier Stage (하나의 이상구간을 가지는 테스팅 단계에서의 소프트웨어 신뢰도 성장 모형화)

  • Park, Man-Gon;Jung, Eun-Yi
    • The Transactions of the Korea Information Processing Society / v.5 no.10 / pp.2575-2583 / 1998
  • The production of highly reliable software systems and their performance evaluation have become important interests in the software industry. Software evaluation has mainly been carried out in terms of both the reliability and the performance of the software system. Software reliability is the probability that no software error occurs for a fixed time interval during the software testing phase. Theoretical software reliability models are sometimes unsuitable for the practical testing phase, in which a software error at a certain testing stage occurs because of imperfect debugging, abnormal software correction, and so on. Such a testing stage needs to be considered an outlying stage, and we can assume that the software reliability does not improve in this outlying testing stage because of a nuisance factor. In this paper, we discuss Bayesian software reliability growth modeling and an estimation procedure in the presence of an unidentified outlying software testing stage, by modifying the Jelinski-Moranda model. We also derive the Bayes estimators of the software reliability parameters under the squared error loss function, assuming prior information. In addition, we evaluate the proposed software reliability growth model with an unidentified outlying stage in an exchangeable model according to the values of the nuisance parameter, using accuracy, bias, trend, and noise metrics as quantitative evaluation criteria through computer simulation.
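
For orientation, the unmodified Jelinski-Moranda model that the paper builds on can be sketched as follows. It assumes a fixed number of initial faults `n_faults` and a per-fault hazard `phi`, so the hazard drops by `phi` after each corrected fault. The function names are illustrative, and the paper's outlier-stage modification and Bayes estimators are not reproduced here.

```python
import math


def jm_hazard(n_faults, phi, stage):
    """Hazard rate during testing stage `stage` (1-based): phi * (N - stage + 1)."""
    return phi * (n_faults - stage + 1)


def jm_reliability(n_faults, phi, stage, t):
    """Probability of no failure for a duration t within the given stage."""
    return math.exp(-jm_hazard(n_faults, phi, stage) * t)


# Reliability over a unit interval grows as faults are removed stage by stage.
growth = [jm_reliability(10, 0.1, stage, 1.0) for stage in range(1, 11)]
```

In this baseline every stage improves reliability; the abstract's outlying stage is precisely a stage where that improvement is assumed not to happen.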

Development of $14"{\times}8.5"$ active matrix flat-panel digital x-ray detector system and Imaging performance (평판 디지털 X-ray 검출기의 개발과 성능 평가에 관한 연구)

  • Park, Ji-Koon;Choi, Jang-Yong;Kang, Sang-Sik;Lee, Dong-Gil;Seok, Dae-Woo;Nam, Sang Hee
    • Journal of radiological science and technology / v.26 no.4 / pp.39-46 / 2003
  • Digital radiographic systems based on solid-state detectors, commonly referred to as flat-panel detectors, are gaining popularity in clinical practice. Large-area, flat-panel solid-state detectors are being investigated for digital radiography. The purpose of this work was to evaluate active matrix flat-panel digital x-ray detectors in terms of their modulation transfer function (MTF), noise power spectrum (NPS), and detective quantum efficiency (DQE). This paper describes the development and evaluation of a selenium-based flat-panel digital x-ray detector. The prototype detector has a pixel pitch of $139\;{\mu}m$ and a total active imaging area of $14{\times}8.5\;inch^2$, giving a total of 3.9 million pixels. The detector includes an x-ray imaging layer of amorphous selenium as a photoconductor, which is evaporated in vacuum onto a TFT flat panel to generate signals in proportion to the incident x-rays. The film thickness was about $500\;{\mu}m$. To evaluate the imaging performance of the digital radiography (DR) system developed in our group, the sensitivity, linearity, MTF, NPS, and DQE of the detector were measured. The measured sensitivity was $4.16{\times}10^6\;ehp/pixel{\cdot}mR$ at a bias field of $10\;V/{\mu}m$ under a beam condition of 41.9 keV. The measured MTF at 2.5 lp/mm was 52%, and the DQE at 1.5 lp/mm was 75%. Linearity was excellent, with a coefficient of determination ($r^2$) of 0.9693.
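
The stated pixel count can be checked from the pixel pitch and the active area. The sketch below assumes the whole 14 x 8.5 inch area is tiled at the 139 µm pitch. The abstract does not give the actual array dimensions, so the column and row counts are estimates, and `pixel_grid` is an illustrative helper name.

```python
MM_PER_INCH = 25.4


def pixel_grid(width_in, height_in, pitch_um):
    """Estimate whole pixels per side for an active area tiled at the given pitch."""
    pitch_mm = pitch_um / 1000.0
    cols = int(width_in * MM_PER_INCH / pitch_mm)
    rows = int(height_in * MM_PER_INCH / pitch_mm)
    return cols, rows


cols, rows = pixel_grid(14, 8.5, 139)
total_pixels = cols * rows  # a little under 4 million, in line with the stated 3.9 million

# Nyquist limit implied by the pitch, in line pairs per mm.
nyquist_lp_per_mm = 1.0 / (2 * 0.139)
```

The Nyquist limit from this pitch is about 3.6 lp/mm, so the reported MTF and DQE frequencies (2.5 and 1.5 lp/mm) both lie below it.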

Modified Traditional Calibration Method of CRNP for Improving Soil Moisture Estimation (산악지형에서의 CRNP를 이용한 토양 수분 측정 개선을 위한 새로운 중성자 강도 교정 방법 검증 및 평가)

  • Cho, Seongkeun;Nguyen, Hoang Hai;Jeong, Jaehwan;Oh, Seungcheol;Choi, Minha
    • Korean Journal of Remote Sensing / v.35 no.5_1 / pp.665-679 / 2019
  • Mesoscale soil moisture measurement from the promising Cosmic-Ray Neutron Probe (CRNP) is expected to bridge the gap between large-scale microwave remote sensing and point-based in-situ soil moisture observations. The traditional calibration based on the $N_0$ method is used to convert neutron intensity measured at the CRNP to field-scale soil moisture. However, the static calibration parameter $N_0$ used in the traditional technique is insufficient to quantify long-term soil moisture variation and is easily influenced by different time-variant factors, contributing to high uncertainties in the CRNP soil moisture product. Consequently, in this study, we proposed a modified traditional calibration method, the so-called Dynamic-$N_0$ method, which takes into account the temporal variation of $N_0$ to improve CRNP-based soil moisture estimation. In particular, a nonlinear regression method was developed to directly estimate the time series of $N_0$ from the corrected neutron intensity. The $N_0$ time series were then reapplied to generate the soil moisture. We evaluated the performance of the Dynamic-$N_0$ method for soil moisture estimation against the traditional one by using a weighted in-situ soil moisture product. The results indicated that the Dynamic-$N_0$ method outperformed the traditional calibration technique: the correlation coefficient increased from 0.70 to 0.72, RMSE decreased from 0.036 to 0.026, and bias decreased from -0.006 to $-0.001\;m^3m^{-3}$. The superior performance of the Dynamic-$N_0$ calibration method revealed that the temporal variability of $N_0$ was caused by hydrogen pools surrounding the CRNP. Although several uncertainty sources contributing to the variation of $N_0$ were not fully identified, the proposed calibration method gives new insight into improving field-scale soil moisture estimation from the CRNP.
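
The abstract does not write out the calibration function. A widely used form, assumed here, is the relation of Desilets et al. (2010), which maps the ratio of corrected neutron intensity to $N_0$ onto water content. The sketch shows that relation and its inversion for $N_0$ from a single reference soil moisture value. It does not reproduce the paper's nonlinear regression for the $N_0$ time series, and the function names are illustrative. Unit conversion via bulk density is omitted.

```python
# Shape constants of the Desilets et al. (2010) calibration function (assumed form).
A0, A1, A2 = 0.0808, 0.372, 0.115


def soil_moisture(n_corrected, n0):
    """Water content from corrected neutron intensity and the parameter N0."""
    return A0 / (n_corrected / n0 - A1) - A2


def solve_n0(n_corrected, theta_ref):
    """Invert the calibration function: the N0 that reproduces a reference water content."""
    return n_corrected / (A0 / (theta_ref + A2) + A1)


# Hypothetical numbers: one neutron count rate and one in-situ reference value.
n0_static = solve_n0(2000.0, 0.25)
theta_check = soil_moisture(2000.0, n0_static)  # recovers the reference value
```

A static calibration applies one such `n0_static` to the whole record, whereas a dynamic scheme lets the value change over time.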

A preliminary assessment of high-spatial-resolution satellite rainfall estimation from SAR Sentinel-1 over the central region of South Korea (한반도 중부지역에서의 SAR Sentinel-1 위성강우량 추정에 관한 예비평가)

  • Nguyen, Hoang Hai;Jung, Woosung;Lee, Dalgeun;Shin, Daeyun
    • Journal of Korea Water Resources Association / v.55 no.6 / pp.393-404 / 2022
  • Reliable terrestrial rainfall observations from satellites at finer spatial resolution are essential for urban hydrological and microscale agricultural demands. Although various satellite rainfall products based on the traditional "top-down" approach have been widely used, they are limited in spatial resolution. This study aims to assess the potential of a novel "bottom-up" approach for rainfall estimation, the parameterized SM2RAIN model, applied to C-band SAR Sentinel-1 satellite data (SM2RAIN-S1), to generate high-spatial-resolution terrestrial rainfall estimates (0.01° grid/6-day) over Central South Korea. Its performance was evaluated for both spatial and temporal variability using the respective rainfall data from a conventional reanalysis product and a rain gauge network for a 1-year period over two sub-regions of Central South Korea: the mixed forest-dominated middle sub-region and the cropland-dominated west coast sub-region. Evaluation results indicated that the SM2RAIN-S1 product can capture general rainfall patterns in Central South Korea and holds potential for high-spatial-resolution rainfall measurement at the local scale over different land covers, while providing less biased rainfall estimates relative to rain gauge observations. Moreover, the SM2RAIN-S1 rainfall product performed better in mixed forests in terms of Pearson's correlation coefficient (R = 0.69), implying the suitability of 6-day SM2RAIN-S1 data for capturing the temporal dynamics of soil moisture and rainfall in mixed forests. However, in terms of RMSE and bias, the SM2RAIN-S1 rainfall product performed better over croplands than over mixed forests, indicating that the larger errors induced by high evapotranspiration losses (especially in mixed forests) need to be accounted for in further improvement of SM2RAIN.
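
The three scores used in this evaluation (bias, RMSE, and Pearson's R) can be sketched in plain Python as below. The definitions are the conventional ones, with bias taken as the mean of estimate minus observation. The abstract does not state the paper's exact conventions, so the sign convention is an assumption.

```python
import math


def bias(est, obs):
    """Mean error: positive when the estimates are too high on average."""
    return sum(e - o for e, o in zip(est, obs)) / len(est)


def rmse(est, obs):
    """Root mean square error between estimates and observations."""
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(est, obs)) / len(est))


def pearson_r(est, obs):
    """Pearson's correlation coefficient between estimates and observations."""
    me, mo = sum(est) / len(est), sum(obs) / len(obs)
    cov = sum((e - me) * (o - mo) for e, o in zip(est, obs))
    var_e = sum((e - me) ** 2 for e in est)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / math.sqrt(var_e * var_o)
```

Because R ignores a constant offset or scale error while RMSE and bias do not, a product can rank best on correlation in one land cover and best on RMSE and bias in another, as the abstract reports.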

An Analysis of the Government Officer's Understanding on Landscape Law and Institutions (경관제도에 대한 경관담당 공무원 인식조사)

  • Joo, Shin-Ha
    • Journal of the Korean Institute of Landscape Architecture / v.45 no.3 / pp.54-65 / 2017
  • The purpose of this study is to investigate perceptions of landscape law and institutions and to provide basic data for the improvement of landscape systems. Specifically, we analyzed the importance and achievement of various landscape systems, and examined the understanding and perception of government officers regarding landscape plans, landscape projects, landscape agreements, landscape reviews and landscape committees, landscape ordinances, and landscape administration. The main results of the study are summarized as follows. 1. Overall, interest in the landscape administration system was high, and respondents were also positive about the utility of the landscape law and the landscape charter. The IPA analysis showed that the landscape plan and the landscape policy plan need to be intensively improved. 2. The landscape plan is mostly used for responding to landscape reviews or complaint requests, but about 10.8% of respondents said that they did not refer to it at all, so it is urgent to make the contents of the landscape plan more practical and to improve its performance. Although many officers thought that less than 18 months would be quite enough for landscape plans, this duration issue needs to be reconsidered. 3. To improve landscape projects and landscape agreements, it seems that budget securing, experts, and promotional organizations should be improved first. 4. It is urgently necessary to enhance the landscape review committee's understanding of overall landscape law and systems in order to supplement the landscape review and the landscape committee. 5. Administrative support such as personnel recruitment is required for landscape ordinances and landscape administration, and many officers were also found to bear a great burden in making subjective judgments as the person in charge. There could be a positive bias in the results of the study, because the survey was conducted only among public officials who participated in the education.
Nevertheless, the results will be helpful for understanding the overall tendencies of the landscape system. I hope that they will help make future improvements to the landscape system more realistic.

Downscaling of Sunshine Duration for a Complex Terrain Based on the Shaded Relief Image and the Sky Condition (하늘상태와 음영기복도에 근거한 복잡지형의 일조시간 분포 상세화)

  • Kim, Seung-Ho;Yun, Jin I.
    • Korean Journal of Agricultural and Forest Meteorology / v.18 no.4 / pp.233-241 / 2016
  • Experiments were carried out to quantify the topographic effects on the attenuation of sunshine in complex terrain, and the results are expected to help convert the coarse-resolution sunshine duration information provided by the Korea Meteorological Administration (KMA) into a detailed map reflecting the terrain characteristics of a mountainous watershed. Hourly shaded relief images for one year, with each pixel holding a brightness value from 0 to 255, were constructed by applying shadow modeling and skyline analysis techniques to the 3 m resolution digital elevation model of an experimental watershed on the southern slope of Mt. Jiri in Korea. Using a bimetal sunshine recorder, sunshine duration was measured at three points with different terrain conditions in the watershed from May 15, 2015 to May 14, 2016. The brightness values of the three corresponding pixels on the shaded relief map were extracted and regressed against the measured sunshine duration, resulting in a brightness-sunshine duration response curve for a clear day. We devised a method to calibrate this curve equation according to the sky condition, categorized by cloud amount, and used it to derive an empirical model for estimating sunshine duration over complex terrain. When the performance of this model was compared with a conventional scheme for estimating sunshine duration over a horizontal plane, the estimation bias improved remarkably and the root mean square error for daily sunshine hours was 1.7 h, a reduction of 37% from the conventional method. To apply this model to a given area, the clear-sky sunshine duration of each pixel should first be produced at hourly intervals by driving the curve equation with the hourly shaded relief image of the area. Next, the cloud effect is corrected using the 3-hourly 'sky condition' of the KMA digital forecast products. Finally, daily sunshine hours can be obtained by accumulating the hourly sunshine duration.
A detailed sunshine duration distribution of 3m horizontal resolution was obtained by applying this procedure to the experimental watershed.
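
The three-step application procedure described in the abstract (hourly clear-sky sunshine from brightness, correction by the 3-hourly sky condition, daily accumulation) can be sketched for a single pixel as follows. Everything numeric here is a placeholder: the abstract gives neither the fitted response curve nor the cloud corrections, so the linear `clear_sky_hours` curve, the `CLOUD_FACTOR` values, and the category names are hypothetical.

```python
# Hypothetical attenuation factor per sky-condition category (not from the paper).
CLOUD_FACTOR = {"clear": 1.0, "partly_cloudy": 0.7, "mostly_cloudy": 0.4, "overcast": 0.0}


def clear_sky_hours(brightness):
    """Placeholder response curve: map a 0-255 brightness value to 0-1 sunshine hours."""
    return min(max(brightness / 255.0, 0.0), 1.0)


def daily_sunshine_hours(hourly_brightness, sky_3hourly):
    """Accumulate cloud-corrected hourly sunshine for one pixel over one day.

    hourly_brightness: one shaded-relief brightness value per hour.
    sky_3hourly: one sky-condition label per 3-hour block covering those hours.
    """
    total = 0.0
    for hour, brightness in enumerate(hourly_brightness):
        sky = sky_3hourly[hour // 3]  # step 2: pick the forecast block for this hour
        total += clear_sky_hours(brightness) * CLOUD_FACTOR[sky]  # steps 1 and 2
    return total  # step 3: daily accumulation
```

Running the same function over every pixel of the hourly shaded relief images would yield a daily map at the resolution of the elevation model.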

A Evaluation of Direct Payment on Agricultural Income effect using Farm Manager Registration Information (농업경영체 등록정보를 활용한 농업직불제 소득효과 분석)

  • Han, Suk-Ho;Chae, Gwang-Seok
    • Journal of the Korea Academia-Industrial cooperation Society / v.17 no.5 / pp.195-202 / 2016
  • The government has run and managed various forms of direct payment systems, such as the paddy and field direct payments, to ease the instability of farm incomes caused by market opening and to preserve farm income. Direct payments to the agricultural sector are a key policy instrument that plays an important role in income stabilization. Despite the large amount of spending, farm-level analyses of the status of direct payments and of their policy effects, such as their contribution to income stability, are insufficient. This paper used the farm unit DB for 2014 and 2015 to perform a farm-level analysis of direct payments and derived implications for the performance evaluation system. As a result, the distribution of direct payments showed considerable bias to the left side compared with the normal distribution curve. Approximately half of the farms (49.3%) in the 2014 DB were found to receive less than 100,000 won per year in direct payments. Larger-scale farms showed significantly greater income and income-stabilizing effects, because direct payments contribute more to farm income in proportion to the area. Among older farmers, the high contribution of direct payments to farm income was found to be an advantage; however, in small-scale farms of less than 0.5 ha, the contribution of direct payments to farm household income was only 3%. In large-scale farms of 10 ha or more, the contribution to farm income was found to be 29.4%. The income of large farms was 10 times larger than that of small farms, and the direct payment entitlements they received were 110 times larger. These results suggest that the direct payment policy requires future improvement and modification.

A Study on Training Dataset Configuration for Deep Learning Based Image Matching of Multi-sensor VHR Satellite Images (다중센서 고해상도 위성영상의 딥러닝 기반 영상매칭을 위한 학습자료 구성에 관한 연구)

  • Kang, Wonbin;Jung, Minyoung;Kim, Yongil
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1505-1514 / 2022
  • Image matching is a crucial preprocessing step for the effective utilization of multi-temporal and multi-sensor very high resolution (VHR) satellite images. Deep learning (DL) methods, which are attracting widespread interest, have proven to be an efficient approach to measuring the similarity between image pairs quickly and accurately by extracting complex and detailed features from satellite images. However, image matching of VHR satellite images remains challenging because the results of DL models depend on the quantity and quality of the training dataset, and because creating training datasets from VHR satellite images is difficult. Therefore, this study examines the feasibility of a DL-based method for matching pair extraction, which is the most time-consuming process in image registration. This paper also aims to analyze the factors that affect accuracy depending on the configuration of the training dataset, when a training dataset for DL-based image matching is developed from an existing multi-sensor VHR image database with bias. For this purpose, the generated training dataset was composed of correct and incorrect matching pairs by assigning true and false labels to image pairs extracted with a grid-based Scale Invariant Feature Transform (SIFT) algorithm from a total of 12 multi-temporal and multi-sensor VHR images. The Siamese convolutional neural network (SCNN), proposed for matching pair extraction on the constructed training dataset, carries out model learning and measures similarity by passing two images in parallel through two identical convolutional neural network structures. The results of this study confirm that data acquired from a VHR satellite image database can be used as a DL training dataset and indicate the potential to improve the efficiency of the matching process through appropriate configuration of multi-sensor images.
DL-based image matching techniques using multi-sensor VHR satellite images are expected to replace existing manual feature extraction methods owing to their stable performance, and to develop further into an integrated DL-based image registration framework.

Evaluation of bias and uncertainty in snow depth reanalysis data over South Korea (한반도 적설심 재분석자료의 오차 및 불확실성 평가)

  • Jeon, Hyunho;Lee, Seulchan;Lee, Yangwon;Kim, Jinsoo;Choi, Minha
    • Journal of Korea Water Resources Association / v.56 no.9 / pp.543-551 / 2023
  • Snow is an essential climate factor that affects the climate system and surface energy balance, and it also plays a crucial role in the water balance by providing solid water stored during the winter for spring runoff and groundwater recharge. In this study, snow depth data from the Local Data Assimilation and Prediction System (LDAPS), the Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2), and ERA5-Land were statistically analyzed to evaluate their applicability in South Korea. The statistical analysis between the Automated Synoptic Observing System (ASOS) ground observation data provided by the Korea Meteorological Administration (KMA) and the reanalysis data showed that LDAPS and ERA5-Land were highly correlated, with correlation coefficients of more than 0.69, but LDAPS showed a large error, with an RMSE of 0.79 m. For MERRA-2, the correlation coefficient was lower, at 0.17, because a constant value was estimated continuously for some periods, which did not adequately simulate the increasing and decreasing trends in the data. The statistical analysis of LDAPS and ASOS showed high performance near Gangwon Province, where average snowfall is relatively high, and low performance in the southern region, where average snowfall is low. Finally, the error variance between the four independent snow depth data used in this study was calculated through triple collocation (TC), and a merged snow depth dataset was produced using weighting factors. Among the reanalysis data, error variance was highest for LDAPS, followed by MERRA-2 and ERA5-Land, and LDAPS was given a lower weighting factor because of its higher error variance. In addition, the spatial distribution of ERA5-Land snow depth data showed less variability, so the TC-merged snow depth data showed a spatial distribution similar to that of MERRA-2, which has a low spatial resolution.
Considering the correlation, error, and uncertainty of the data, the ERA5-Land data is suitable for snow-related analysis in South Korea. In addition, it is expected that LDAPS data, which is highly correlated with other data but tends to be overestimated, can be actively utilized for high-resolution representation of regional and climatic diversity if appropriate corrections are performed.
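
Triple collocation estimates the error variance of each of three collocated datasets from their pairwise covariances, assuming the errors are mutually independent and independent of the true signal. The sketch below uses the common covariance formulation and inverse-error-variance weights for merging. The abstract does not spell out the weighting formula used in the study, so that choice is an assumption, and the synthetic series stand in for real snow depth products.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def tc_error_variances(x, y, z):
    """Error variance of each series from pairwise covariances (covariance notation)."""
    cxy, cxz, cyz = cov(x, y), cov(x, z), cov(y, z)
    return (cov(x, x) - cxy * cxz / cyz,
            cov(y, y) - cxy * cyz / cxz,
            cov(z, z) - cxz * cyz / cxy)


def merge_weights(error_vars):
    """Inverse-error-variance weights, normalised to sum to one."""
    inverse = [1.0 / v for v in error_vars]
    total = sum(inverse)
    return [w / total for w in inverse]


# Synthetic check (assumption): one true signal observed with three noise levels.
rng = random.Random(0)
truth = [rng.gauss(0.0, 1.0) for _ in range(20000)]
x = [t + rng.gauss(0.0, 0.1) for t in truth]
y = [t + rng.gauss(0.0, 0.2) for t in truth]
z = [t + rng.gauss(0.0, 0.3) for t in truth]

error_vars = tc_error_variances(x, y, z)
weights = merge_weights(error_vars)
merged = [weights[0] * a + weights[1] * b + weights[2] * c for a, b, c in zip(x, y, z)]
```

With weights of this kind, the dataset with the largest estimated error variance contributes least to the merged product, which matches the abstract's statement that LDAPS received a lower weighting factor.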