• Title/Summary/Keyword: probability prediction

The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.83-102 / 2021
  • The government recently announced various policies for developing the big-data and artificial intelligence fields, giving the public a great opportunity through the disclosure of high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea, and it is strongly committed to backing export companies through various systems. Nevertheless, there are still few cases of realized business models based on big-data analyses. In this situation, this paper aims to develop a new business model that can be applied to ex-ante prediction of the likelihood of a credit guarantee insurance accident. We utilize internal data from KSURE, which supports export companies in Korea, and apply machine learning models. We then compare performance among predictive models including Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, many researchers have tried to find better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. The prediction of financial distress or bankruptcy originated with Smith (1930), Fitzpatrick (1932), and Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), which is based on multiple discriminant analysis. This model is still widely used in both research and practice. Altman proposes a score model that utilizes five key financial ratios to predict the probability of bankruptcy in the next two years. Ohlson (1980) introduces a logit model to address some limitations of previous models. Furthermore, Elmer and Borowski (1988) develop and examine a rule-based, automated system that conducts the financial analysis of savings and loans.
Since the 1980s, researchers in Korea have also studied the prediction of financial distress or bankruptcy. Kim (1987) analyzes financial ratios and develops a prediction model. Han et al. (1995, 1996, 1997, 2003, 2005, 2006) construct prediction models using various techniques including artificial neural networks. Yang (1996) introduces multiple discriminant analysis and a logit model. Kim and Kim (2001) utilize artificial neural network techniques for ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely using diverse models such as Random Forest or SVM. One major distinction of our research from previous research is that we examine the predicted probability of default for each sample case, rather than only investigating the classification accuracy of each model over the entire sample. Most predictive models in this paper show a classification accuracy of about 70% on the entire sample. Specifically, the LightGBM model shows the highest accuracy of 71.1% and the Logit model the lowest accuracy of 69%. However, we confirm that these results are open to multiple interpretations. In the business context, we have to put more emphasis on minimizing type 2 error, which causes more harmful operating losses for the guaranty company. Thus, we also compare classification accuracy by splitting the predicted probability of default into ten equal intervals. When we examine the classification accuracy for each interval, the Logit model has the highest accuracy of 100% for the 0~10% interval of the predicted probability of default, but a relatively lower accuracy of 61.5% for the 90~100% interval.
On the other hand, Random Forest, XGBoost, LightGBM, and DNN show more desirable results, since they achieve a higher level of accuracy for both the 0~10% and 90~100% intervals of the predicted probability of default, but a lower level of accuracy around a predicted probability of 50%. Regarding the distribution of samples across the intervals, both the LightGBM and XGBoost models place a relatively large number of samples in both the 0~10% and 90~100% intervals. Although the Random Forest model has an advantage in classification accuracy on a small number of cases, LightGBM or XGBoost could become a more desirable model since they classify a large number of cases into the two extreme intervals of the predicted probability of default, even allowing for their relatively low classification accuracy. Considering the importance of type 2 error and total prediction accuracy, XGBoost and DNN show superior performance. Random Forest and LightGBM follow with good results, while logistic regression shows the worst performance. However, each predictive model has a comparative advantage under different evaluation standards. For instance, the Random Forest model shows almost 100% accuracy for samples that are expected to have a high probability of default. Collectively, we can construct more comprehensive ensemble models that contain multiple machine learning classifiers and conduct majority voting to maximize overall performance.
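
The interval-wise comparison and the majority-voting idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `accuracy_by_interval` and `majority_vote`, the 0.5 decision threshold, and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np


def accuracy_by_interval(y_true, p_default, n_bins=10, threshold=0.5):
    """Return a list of (count, accuracy) per equal-width probability interval.

    y_true    : 0/1 labels (1 = default / insurance accident), hypothetical input
    p_default : predicted probability of default for each case
    """
    y_true = np.asarray(y_true)
    p_default = np.asarray(p_default, dtype=float)
    y_pred = (p_default >= threshold).astype(int)
    # Interval index 0..n_bins-1; a probability of exactly 1.0 goes in the last one.
    bins = np.minimum((p_default * n_bins).astype(int), n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = bins == b
        count = int(mask.sum())
        # Accuracy is undefined (NaN) for an interval with no cases.
        acc = float((y_pred[mask] == y_true[mask]).mean()) if count else float("nan")
        rows.append((count, acc))
    return rows


def majority_vote(predictions):
    """Hard majority vote over 0/1 predictions of shape (n_models, n_samples)."""
    predictions = np.asarray(predictions)
    n_models = predictions.shape[0]
    # A case is labeled 1 only if strictly more than half of the models say 1.
    return (predictions.sum(axis=0) * 2 > n_models).astype(int)
```

For example, `accuracy_by_interval(y_true, p_default)[9]` gives the number of cases and the accuracy in the 90~100% interval, so both the sample distribution and the per-interval accuracy discussed above can be read from the same result.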

Development of a Screening Method for Deforestation Area Prediction using Probability Model (확률모델을 이용한 산림전용지역의 스크리닝방법 개발)

  • Lee, Jung-Soo
    • Journal of the Korean Association of Geographic Information Studies / v.11 no.2 / pp.108-120 / 2008
  • This paper discusses the prediction of deforestation areas using probability models built from a forest census database, a geographic information system (GIS) database, and a land cover database. The land cover data were analyzed using remotely sensed (RS) Landsat TM data from 1989 to 2001. Over the 12-year analysis period, the deforestation area was about 40ha. Most of the deforestation was attributable to road construction and residential development. About 80% of the deforestation areas for residential development were found within 100m of the road network. More than 20% of the deforestation areas for forest road construction were within 100m of the road network. Geographic factors and vegetation change detection (VCD) factors were used in the probability models to construct a deforestation occurrence map. We examined how the size of the area partition into training and validation areas affects the probability models. The Bayes model provided a better deforestation prediction rate than the regression model.

An Efficient Indexing Technique for Location Prediction of Moving Objects in the Road Network Environment (도로 네트워크 환경에서 이동 객체 위치 예측을 위한 효율적인 인덱싱 기법)

  • Hong, Dong-Suk;Kim, Dong-Oh;Lee, Kang-Jun;Han, Ki-Joon
    • Journal of Korea Spatial Information System Society / v.9 no.1 / pp.1-13 / 2007
  • The need for future indexes is increasing as various location-based services require prompt prediction of the future location of moving objects. A representative research topic related to future indexes is the probability trajectory prediction technique, which improves reliability by using the past trajectory information of moving objects in the road network environment. However, the prediction performance of this technique is degraded by the heavy load of extensive future trajectory searches in long-range future queries, and its index maintenance cost is high due to frequent updates of future trajectories. Thus, this paper proposes the Probability Cell Trajectory-Tree (PCT-Tree), a cell-based future indexing technique for efficient long-range future location prediction. The PCT-Tree reduces the size of the index by rebuilding the probability of extensive past trajectories in units of cells, and improves the prediction performance of long-range future queries. In addition, it predicts reliable future trajectories using information on past trajectories and thereby minimizes the communication cost resulting from errors in future trajectory prediction and the cost of rebuilding the index to update future trajectories. Through experiments, we demonstrated the superiority of the PCT-Tree over existing indexing techniques in the performance of long-range future queries.

A Study on the Probabilistic Prediction of Typhoons Approaching the Korean-Peninsula (한반도에 대한 태풍내습확률 산정에 관한 연구)

  • Park, Jun-Il;Yu, Hui-Jeong;Lee, Bae-Ho
    • Water for future / v.17 no.4 / pp.273-279 / 1984
  • An attempt is made to present a method for predicting typhoons approaching the Korean Peninsula. The method is based on Bayes' theorem and improves the observed (prior) probabilities of typhoons approaching the Korean sea area by incorporating conditional probabilities. A total of 248 typhoons are collected and analyzed to establish the prior and conditional probabilities according to the defined procedure. The typhoons used are those which encompassed the western Pacific area to which the Korean Peninsula is subject. The results of exemplary computations suggest that the presented method is promising for predicting approaching typhoons.
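
As a rough illustration of the Bayesian update this abstract refers to, the sketch below combines prior probabilities with conditional probabilities to obtain posterior probabilities. The function name `bayes_posterior` and the numbers in the usage lines are hypothetical; they are not taken from the paper's 248-typhoon data set.

```python
def bayes_posterior(priors, likelihoods):
    """Posterior P(H_i | E) from priors P(H_i) and likelihoods P(E | H_i).

    The hypotheses H_i are assumed to be mutually exclusive and exhaustive.
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # P(E), by the law of total probability
    if evidence == 0:
        raise ValueError("the observed evidence has zero probability")
    return [j / evidence for j in joint]


# Hypothetical numbers: H_0 = "typhoon approaches", H_1 = "typhoon does not approach".
# E is some observed condition, e.g. the typhoon passing through a given sea area.
posterior = bayes_posterior(priors=[0.3, 0.7], likelihoods=[0.8, 0.2])
```

The posterior for "approaches" is larger than its prior here because the assumed condition is more likely under that hypothesis.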

Detection of a Bias Level in Prediction Errors due to Input Acceleration (입력 가속에서 비롯된 예측오차 바이어스 레벨의 검출)

  • Shin, Hae-Gon;Hong, Sun-Mog
    • Journal of Sensor Science and Technology / v.2 no.1 / pp.57-64 / 1993
  • In this paper, the normalized innovation squared of a Kalman filter is used to detect a bias level in prediction errors due to target accelerations. The probability density function of the normalized innovation squared is obtained for a steady-state Kalman filter and is used to calculate the detection probability of the bias level. A typical example is given to compute the detection probability and to plot the operating characteristic curves of the maneuver detector.
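
A minimal sketch of the test statistic named in this abstract: the normalized innovation squared (NIS) is the quadratic form of the innovation with the inverse of its covariance. For a consistent filter with Gaussian noise and no bias, the NIS follows a chi-square distribution with as many degrees of freedom as the measurement dimension, so a threshold test can flag a bias. The function names, the two-dimensional measurement, and the false-alarm probability are assumptions for illustration; the paper's own density for the biased case is not reproduced here.

```python
import math

import numpy as np


def normalized_innovation_squared(innovation, innovation_cov):
    """NIS = nu^T S^{-1} nu for innovation nu and innovation covariance S."""
    nu = np.asarray(innovation, dtype=float)
    S = np.asarray(innovation_cov, dtype=float)
    # Solve S x = nu instead of forming the explicit inverse of S.
    return float(nu @ np.linalg.solve(S, nu))


def chi2_threshold_2dof(false_alarm_prob):
    """Threshold t with P(X > t) = false_alarm_prob for chi-square, 2 dof.

    With 2 degrees of freedom the tail probability is exp(-t / 2), so the
    threshold has a closed form. Other dimensions need a chi-square table.
    """
    return -2.0 * math.log(false_alarm_prob)


def bias_detected(innovation, innovation_cov, threshold):
    """Declare a bias (e.g. a maneuver) when the NIS exceeds the threshold."""
    return normalized_innovation_squared(innovation, innovation_cov) > threshold
```

Sweeping the false-alarm probability and recording the resulting detection rate on biased data is one way to trace an operating characteristic curve of the kind the abstract mentions.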

Failure Probability Prediction based on probabilistic and stochastic methods in generating units (확률 통계적 기법을 이용한 발전설비 고장확률 예측)

  • Lee, Sung-Hoon;Lee, Seung-Hyuk;Kim, Jin-O;Cha, Seung-Tae;Kim, Tae-Kyun
    • Proceedings of the KIEE Conference / 2004.11b / pp.69-71 / 2004
  • This paper presents a method to predict failure probability related to aging. To calculate failure probability, the Weibull distribution is used because reliability is age-related. The Weibull distribution has shape and scale parameters. Each parameter is estimated with the Data Analytic Method (Type II Censoring), which is relatively simpler and faster than traditional parameter estimation methods. This paper also shows the calculation procedure for probabilistic failure prediction through stochastic data analysis. Consequently, the proposed methods are likely to help utilities, which the new deregulated environment forces to reduce overall costs, maintain an age-related reliability index.
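
The Weibull model mentioned in this abstract can be sketched as follows, assuming the two-parameter form with shape `beta` and scale `eta`, and assuming the parameters have already been estimated (the Type II censoring estimation step is not shown). The function names and any example values are hypothetical.

```python
import math


def weibull_failure_probability(t, beta, eta):
    """Cumulative failure probability F(t) = 1 - exp(-(t / eta) ** beta)."""
    if t < 0 or beta <= 0 or eta <= 0:
        raise ValueError("require t >= 0, beta > 0 and eta > 0")
    return 1.0 - math.exp(-((t / eta) ** beta))


def conditional_failure_probability(age, horizon, beta, eta):
    """P(failure within `horizon` | unit has survived to `age`).

    Equals 1 - R(age + horizon) / R(age) with R(t) = exp(-(t / eta) ** beta),
    written as a single exponential to avoid dividing by a tiny R(age).
    """
    if age < 0 or horizon < 0 or beta <= 0 or eta <= 0:
        raise ValueError("require age, horizon >= 0 and beta, eta > 0")
    exponent = (age / eta) ** beta - ((age + horizon) / eta) ** beta
    return 1.0 - math.exp(exponent)
```

With `beta > 1` the conditional failure probability rises with age, which is the wear-out behavior expected of aging equipment; with `beta == 1` it does not depend on age.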

A long-term tunnel settlement prediction model based on BO-GPBE with SHM data

  • Yang Ding;Yu-Jun Wei;Pei-Sen Xi;Peng-Peng Ang;Zhen Han
    • Smart Structures and Systems / v.33 no.1 / pp.17-26 / 2024
  • A new metro line crossing an existing metro line causes settlement or floating of the existing structures, which creates safety problems for the operation of the existing metro and the construction of the new metro. Therefore, it is necessary to monitor and predict, in real time, the settlement of the existing metro caused by the construction of the new metro. Considering the complexity and uncertainty of metro settlement, a Gaussian Prior Bayesian Emulator (GPBE) probability prediction model based on Bayesian optimization (BO), named BO-GPBE, is proposed. First, the settlement monitoring data are analyzed to determine the influence of the new metro on the settlement of the existing metro. Then, five different acquisition functions, namely expected improvement (EI), expected improvement per second (EIPS), expected improvement per second plus (EIPSP), lower confidence bound (LCB), and probability of improvement (PI), are selected to construct the BO model, and the BO-GPBE model is then established. Finally, three years of settlement monitoring data collected by the structural health monitoring (SHM) system installed on Nanjing Metro Line 10 are employed to demonstrate the effectiveness of BO-GPBE for forecasting the settlement.
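
Three of the acquisition functions listed in this abstract have standard closed forms when the surrogate's prediction at a candidate point is Gaussian with mean `mu` and standard deviation `sigma`. The sketch below assumes a minimization problem and uses hypothetical parameter names (`best`, `xi`, `kappa`); it is not the BO-GPBE implementation, and the per-second variants (EIPS, EIPSP) are omitted.

```python
import math


def _norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best, xi=0.0):
    """EI for minimization: E[max(best - xi - Y, 0)] with Y ~ N(mu, sigma^2)."""
    improvement = best - mu - xi
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / sigma
    return improvement * _norm_cdf(z) + sigma * _norm_pdf(z)


def probability_of_improvement(mu, sigma, best, xi=0.0):
    """PI for minimization: P(Y < best - xi) with Y ~ N(mu, sigma^2)."""
    improvement = best - mu - xi
    if sigma <= 0.0:
        return 1.0 if improvement > 0.0 else 0.0
    return _norm_cdf(improvement / sigma)


def lower_confidence_bound(mu, sigma, kappa=2.0):
    """LCB: optimistic estimate mu - kappa * sigma of the objective value."""
    return mu - kappa * sigma
```

Under the minimization assumption, the next point to evaluate is the candidate that maximizes EI or PI, or the one that minimizes LCB.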

Bayesian Neural Network with Recurrent Architecture for Time Series Prediction

  • Hong, Chan-Young;Park, Jung-Hun;Yoon, Tae-Sung;Park, Jin-Bae
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2004.08a / pp.631-634 / 2004
  • In this paper, the Bayesian recurrent neural network (BRNN) is proposed to predict time series data. Among the various traditional prediction methodologies, a neural network method is considered to be more effective for non-linear and non-stationary time series data. A neural network predictor requires a proper learning strategy to adjust the network weights, and one needs to prepare for non-linear and non-stationary evolution of the network weights. The Bayesian neural network in this paper estimates not a single set of weights but the probability distributions of the weights. In other words, we set the weight vector as the state vector of a state space method and estimate its probability distribution in accordance with Bayesian inference. This approach makes it possible to obtain a more exact estimation of the weights. Moreover, regarding network architecture, it is known that the recurrent feedback structure is superior to the feedforward structure for time series prediction. Therefore, the recurrent network with Bayesian inference, which we call BRNN, is expected to show higher performance than a normal neural network. To verify the performance of the proposed method, time series data are numerically generated and a neural network predictor is applied to them. As a result, BRNN is shown to give better prediction results than a common feedforward Bayesian neural network.

A Study on Modified Linear Prediction Method to Improve Target Estimation (목표물 추정 향상을 위한 수정 선형 예측방법에 대한 연구)

  • Lee, Kwan-Hyeong;Joo, Jong-Hyuk
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.9 no.4 / pp.337-342 / 2016
  • In this paper, we study a modified linear prediction method for estimating target signals correctly. The linear prediction method estimates the direction of arrival by expressing the output of any one antenna element as a linear combination of the other antenna elements. The modified linear prediction method uses an optimal weight and a posterior probability method. Through simulation, we compare the performance of the proposed, Bartlett, and MUSIC methods. In the simulation, the Bartlett and MUSIC methods estimated three target signals, while the proposed method estimated four. We show the superior performance of the proposed algorithm relative to the classical methods in estimating target signals.

Adaptive Call Admission Control Based on Spectrum Holes Prediction in Cognitive Radio Networks (인지라디오망의 스펙트럼홀 예측기반 적응 호수락제어기법)

  • Lee, Jin-yi
    • Journal of Advanced Navigation Technology / v.20 no.5 / pp.440-445 / 2016
  • In cognitive radio networks, there is a scheme in which secondary users (SUs) use predicted spectrum holes that primary users (PUs) do not utilize, for efficient use of the limited spectrum resources. In this paper, we propose an adaptive call admission control framework that minimizes the spectrum hopping call dropped probability (SHDP) to satisfy SU quality of service (QoS). The scheme is based on call admission control (CAC), bandwidth prediction, and adaptive bandwidth assignment. The prediction model predicts not only the number of spectrum holes but also the requested bandwidth of SU spectrum hopping calls, and the CAC then minimizes SHDP via adaptive bandwidth assignment when resources are insufficient for reservation. We adopt a Wiener prediction model to predict the resources. Simulations are conducted to compare the performance of the proposed scheme with an existing scheme and show its ability to minimize the SHDP.