• Title/Abstract/Keywords: probability models

Search results: 1,123 items (processing time: 0.031 s)

The Prediction of Export Credit Guarantee Accident using Machine Learning

  • 조재영;주지환;한인구
    • Journal of Intelligence and Information Systems
    • /
    • Vol. 27, No. 1
    • /
    • pp.83-102
    • /
    • 2021
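  • In August 2020, as a plan to strengthen the role of public institutions in supporting the Korean New Deal, the government selected a total of 20 tasks across five major areas based on the capabilities of each public institution. Through various policies, such as improving public services with big data and artificial intelligence and opening up the high-quality data held by public institutions, it is working to produce results from the Korean New Deal early and to maximize them. Among these institutions, the Korea Trade Insurance Corporation (KSURE), a policy-finance public institution, operates several programs to support domestic exporters, but it has not yet made active use of the big data it holds. To predict export credit guarantee accidents at KSURE in advance, this study applied machine learning models to the corporation's internal data and compared the predictive performance of those models. The prediction models were logistic (logit) regression, random forest, XGBoost, LightGBM, and a deep neural network. In addition to prediction accuracy over the whole sample, the evaluation divided each sample's predicted accident probability into intervals and compared the accuracy for samples predicted with high probability against those predicted with low probability. Each model's prediction accuracy over the whole sample was around 70%, and a detailed analysis of individual samples by accident-probability interval showed 90~100% accuracy in the two extreme intervals (0~20% and 80~100%), indicating that the models can be used in practice. Considering both the importance of Type II error and overall prediction accuracy, XGBoost and the deep neural network were rated the best models. Random forest and LightGBM came next, and logistic regression performed worst. As a case of analyzing KSURE's big data with machine learning models to improve operational efficiency, this study is expected to encourage more active big data analysis and use in practice through machine learning and related methods.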
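
The interval-based evaluation described in this abstract can be illustrated with a minimal sketch. It assumes equal-width 20% bands and a 0.5 classification threshold; the function name, the threshold, and the toy data are illustrative assumptions, not the authors' code or data.

```python
def accuracy_by_probability_band(probs, labels,
                                 edges=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
                                 threshold=0.5):
    """Return {(lo, hi): (count, accuracy)} for each predicted-probability band.

    The 0.5 threshold and the band edges are assumptions for illustration.
    """
    bands = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        # The right edge is included only in the last band so p = 1.0 is counted.
        is_last = hi == edges[-1]
        idx = [i for i, p in enumerate(probs)
               if lo <= p < hi or (is_last and p == hi)]
        if not idx:
            bands[(lo, hi)] = (0, None)  # no samples fell into this band
            continue
        correct = sum((probs[i] >= threshold) == bool(labels[i]) for i in idx)
        bands[(lo, hi)] = (len(idx), correct / len(idx))
    return bands


# Toy predicted accident probabilities and true outcomes (1 = accident).
probs = [0.05, 0.10, 0.15, 0.45, 0.55, 0.85, 0.90, 0.95]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
result = accuracy_by_probability_band(probs, labels)
for band, (count, acc) in result.items():
    print(band, count, acc)
```

In this toy data the two extreme bands are classified perfectly while the middle band is not, which is the pattern the abstract reports for the real models.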

A Comparison of Urban Growth Probability Maps using Frequency Ratio and Logistic Regression Methods

  • Park, So-Young;Jin, Cheung-Kil;Kim, Shin-Yup;Jo, Gyung-Cheol;Choi, Chul-Uong
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • Vol. 38, No. 5_2
    • /
    • pp.194-205
    • /
    • 2010
  • To predict urban growth according to changes in land cover, probability factors were calculated and mapped. Topographic, geographic, social, and political factors were used as prediction variables for constructing probability maps of urban growth. Urban growth-related factors included elevation, slope, aspect, distance from road, road ratio, distance from the main city, land cover, environmental rating and legislative rating. Accounting for these factors, probability maps of urban growth were constructed using frequency ratio (FR) and logistic regression (LR) methods, and the effectiveness of the results was verified by the relative operating characteristic (ROC). ROC values of the urban growth probability index (UGPI) maps by the FR and LR models were 0.937 and 0.940, respectively. The LR map had a slightly higher ROC value than the FR map, but the numerical difference was slight, with both models showing similar results. The FR model is the simplest tool for probability analysis of urban growth, providing a faster and easier calculation process than other available tools. Additionally, the results can be easily interpreted. In contrast, for the LR model, only a limited amount of input data can be processed by the statistical program and a separate conversion process for input and output data is necessary. In conclusion, although the FR model is the simplest way to analyze the probability of urban growth, the LR model is more appropriate because it allows for quantitative analysis.
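
As a rough illustration of the FR method named in this abstract, the sketch below uses one common definition of the frequency ratio: the share of growth cells that fall in a factor class divided by the share of all cells in that class. The function name and the toy grid are assumptions; the abstract does not give the paper's exact formulation.

```python
from collections import Counter


def frequency_ratios(classes, grew):
    """FR per class = (share of growth cells in class) / (share of all cells in class).

    `classes` holds one factor class per cell; `grew` holds 1 if the cell
    became urban, else 0. Both names are hypothetical.
    """
    total_cells = len(classes)
    total_growth = sum(grew)
    if total_growth == 0:
        raise ValueError("no growth cells: frequency ratio is undefined")
    all_counts = Counter(classes)
    growth_counts = Counter(c for c, g in zip(classes, grew) if g)
    return {
        c: (growth_counts[c] / total_growth) / (all_counts[c] / total_cells)
        for c in all_counts
    }


# Toy example: eight cells classified by slope.
slope_class = ["flat", "flat", "flat", "flat", "steep", "steep", "steep", "steep"]
grew = [1, 1, 1, 0, 1, 0, 0, 0]
fr = frequency_ratios(slope_class, grew)
print(fr)
```

An FR above 1 marks a class where growth is over-represented. Under this definition, a per-cell index is typically obtained by summing the FR values of the classes the cell belongs to across all factors.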

Bayesian Procedure for the Multiple Test of Fraction Nonconforming

  • 김경숙;김희정;나명환;손영숙
    • Journal of Korean Society for Quality Management
    • /
    • Vol. 34, No. 1
    • /
    • pp.73-77
    • /
    • 2006
  • In this paper, a Bayesian procedure for the multiple test of the fraction nonconforming, p, is proposed. It is a procedure for checking whether the process is out of control, in control, or under the permissible level for p. The procedure is as follows: first, set up three types of models, $M_1:p=p_0,\;M_2:p<p_0,\;M_3:p>p_0$; second, compute the posterior probability of each model; and then choose the model with the largest posterior probability as the model best fitted to the observed sample among the three competing models. Finally, a simulation study is performed to examine the proposed method.
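
The three-model comparison can be sketched numerically under assumptions the abstract does not state: binomial sampling (x nonconforming items out of n), equal prior probability for each model, and uniform priors on p over (0, p_0) under M_2 and over (p_0, 1) under M_3. The function names and the sample values are illustrative only.

```python
import math


def binom_lik(p, x, n):
    """Binomial likelihood of x nonconforming items in a sample of n."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def mean_lik(lo, hi, x, n, steps=2000):
    """Marginal likelihood under an assumed Uniform(lo, hi) prior (midpoint rule)."""
    h = (hi - lo) / steps
    area = sum(binom_lik(lo + (k + 0.5) * h, x, n) for k in range(steps)) * h
    return area / (hi - lo)


def model_posteriors(x, n, p0):
    """Posterior probabilities of M1: p = p0, M2: p < p0, M3: p > p0.

    Equal prior model probabilities are assumed, so each posterior is
    proportional to the model's marginal likelihood.
    """
    marginal = {
        "M1": binom_lik(p0, x, n),
        "M2": mean_lik(0.0, p0, x, n),
        "M3": mean_lik(p0, 1.0, x, n),
    }
    total = sum(marginal.values())
    return {name: value / total for name, value in marginal.items()}


post = model_posteriors(x=15, n=100, p0=0.05)
best = max(post, key=post.get)
print(post, best)
```

With 15 nonconforming items in 100 against p_0 = 0.05, the largest posterior falls on M_3, i.e. the process would be judged out of control under these assumed priors.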

PERFORMANCE ANALYSIS OF A STATISTICAL MULTIPLEXER WITH THREE-STATE BURSTY SOURCES

  • Choi, Bong-Dae;Jung, Yong-Wook
    • Communications of the Korean Mathematical Society
    • /
    • Vol. 14, No. 2
    • /
    • pp.405-423
    • /
    • 1999
  • We consider a statistical multiplexer model with finite buffer capacity and a finite number of independent, identical 3-state bursty voice sources. The burstiness of the sources is modeled by describing two different active periods (at the rate of one packet per slot) and the passive periods during which no packets are generated. Assuming a mixture of two geometric distributions for the active period and a geometric distribution for the passive period, we derive a recursive algorithm for the probability mass function of the buffer contents (in packets). We also obtain the loss probability and the distribution of packet delay. Numerical results show that the system performance deteriorates considerably as the variance of the active period increases. Also, we see that the loss probability of 2-state Markov models is less than that of 3-state Markov models.

Bayesian Procedure for the Multiple Test of Fraction Nonconforming

  • 김경숙;김희정;나명환;손영숙
    • Korean Society for Quality Management: Conference Proceedings
    • /
    • Korean Society for Quality Management 2006 Spring Conference
    • /
    • pp.325-329
    • /
    • 2006
  • In this paper, a Bayesian procedure for the multiple test of the fraction nonconforming, p, is proposed. It is a procedure for checking whether the process is out of control, in control, or under the permissible level for p. The procedure is as follows: first, set up three types of models, $M_1:p=p_0,\;M_2:p<p_0,\;M_3:p>p_0$; second, compute the posterior probability of each model; and then choose the model with the largest posterior probability as the model best fitted to the observed sample among the three competing models. Finally, a simulation study is performed to examine the proposed method.

Applications of Harmony Search in parameter estimation of probability distribution models for non-homogeneous hydro-meteorological extreme events

  • Lee, Tae-Sam;Yoon, Suk-Min;Gang, Myung-Kook;Shin, Ju-Young;Jung, Chang-Sam
    • Korea Water Resources Association: Conference Proceedings
    • /
    • Korea Water Resources Association 2012 Conference
    • /
    • pp.258-258
    • /
    • 2012
  • In frequency analyses of hydrological data, the variables of interest must be homogeneous and independent. However, recent evidence has shown that the occurrence of extreme hydro-meteorological events is influenced by large-scale climate variability, and the assumption of homogeneity no longer generally holds. Therefore, to account for the non-homogeneous characteristics of hydro-meteorological variables, we propose a parameter estimation method for probability models using meta-heuristic algorithms, specifically harmony search. Data from all the weather stations in South Korea were used to demonstrate the performance of the proposed approaches. The results showed that the proposed parameter estimation method using harmony search is a comparable alternative for fitting probability distributions to non-homogeneous hydro-meteorological data.

INTRODUCTION OF THREE FUNCTIONAL MODELS MATCHED TO THE STOCHASTIC RESPONSE EVALUATION OF ACOUSTIC ENVIRONMENTAL SYSTEM AND ITS APPLICATION TO A SOUND INSULATION SYSTEM

  • Ohta, Mitsuo;Fujita, Yoshifumi
    • Acoustical Society of Korea: Conference Proceedings
    • /
    • Acoustical Society of Korea 1994 FIFTH WESTERN PACIFIC REGIONAL ACOUSTICS CONFERENCE SEOUL KOREA
    • /
    • pp.686-691
    • /
    • 1994
  • For evaluating the response fluctuation of an actual environmental acoustic system excited by arbitrary random inputs, it is important to predict the whole form of the probability distribution, which is closely connected with evaluation indexes such as Lx and Leq. In this paper, a new type of evaluation method is proposed by introducing three functional models matched to the prediction of the response probability distribution from a problem-oriented viewpoint. Because sound intensity is a positive variable, the response probability density function can reasonably be expressed theoretically in the form of a statistical Laguerre expansion series. The relationship between input and output is described by the regression relationship between the distribution parameters (containing the expansion coefficients of this expression) and the stochastic input. These regression functions are expressed in terms of an orthogonal series expansion, and their parameters are determined based on the least-squares error criterion and a measure of statistical independency.

MATHEMATICAL ANALYSIS OF AN "SIR" EPIDEMIC MODEL IN A CONTINUOUS REACTOR - DETERMINISTIC AND PROBABILISTIC APPROACHES

  • El Hajji, Miled;Sayari, Sayed;Zaghdani, Abdelhamid
    • Journal of the Korean Mathematical Society
    • /
    • Vol. 58, No. 1
    • /
    • pp.45-67
    • /
    • 2021
  • In this paper, a mathematical dynamical system involving both deterministic (with or without delay) and stochastic "SIR" epidemic models with nonlinear incidence rate in a continuous reactor is considered. A profound qualitative analysis is given. It is proved that, for both deterministic models, if $\mathcal{R}_d > 1$, then the endemic equilibrium is globally asymptotically stable. However, if $\mathcal{R}_d \leq 1$, then the disease-free equilibrium is globally asymptotically stable. Concerning the stochastic model, Feller's test combined with the canonical probability method was used to draw conclusions about the long-time dynamics of the stochastic model. The results improve and extend the results obtained for the deterministic model in both its forms. It is proved that if $\mathcal{R}_s > 1$, the disease is stochastically permanent with full probability. However, if $\mathcal{R}_s \leq 1$, then the disease dies out with full probability. Finally, some numerical tests are done in order to validate the obtained results.

Comparison of machine learning techniques to predict compressive strength of concrete

  • Dutta, Susom;Samui, Pijush;Kim, Dookie
    • Computers and Concrete
    • /
    • Vol. 21, No. 4
    • /
    • pp.463-470
    • /
    • 2018
  • Soft computing, i.e., machine learning techniques and regression algorithms, has gained much importance for the prediction of various parameters in different fields of science and engineering. This paper shows how regression models can be implemented for the prediction of the compressive strength of concrete. Three models are considered: Gaussian Process for Regression (GPR), Multivariate Adaptive Regression Splines (MARS) and Minimax Probability Machine Regression (MPMR). The contents of cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregate and fine aggregate, together with age in days, were taken as inputs and compressive strength as output for the GPR, MARS and MPMR models. A comparatively large data set of 1030 normalized, previously published experimental results was used. The results obtained from all the above-mentioned models are compared, and the model that provides the best fit is established. The experimental results show that the proposed models are robust for determining the compressive strength of concrete.

Models for Internet Traffic Sharing in Computer Network

  • Alrusaini, Othman A.;Shafie, Emad A.;Elgabbani, Badreldin O.S.
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 21, No. 8
    • /
    • pp.28-34
    • /
    • 2021
  • Internet Service Providers (ISPs) constantly endeavor to resolve network congestion in order to provide fast and cheap services to customers. This study suggests two models based on Markov chains, using three and four access attempts to complete the call. It involves a comparative study of four models to check the relationship between Internet access traffic sharing and the possibility of network jamming. The first model is a Markov chain based on a call-by-call attempt, whereas the second is based on two attempts. Models III and IV, suggested by the authors, are based on the assumption of three and four attempts. The assessment reveals that increasing the number of attempts for the same operator can sometimes also increase the customers' chances of completing the call, owing to blocking probabilities. Three and four attempts express the actual relationship between traffic sharing and blocking probability, based on Markov chains computed with MATLAB tools from initial probability values. The study reports striking results compared with Models I and II, which use one and two attempts. The success ratio for completing the call is 84.5% for the first model and 90.6% for the second, whereas the models using three and four attempts reach 94.95% and 95.12%, respectively.
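
The link between the number of attempts and the chance of completing a call can be illustrated with a deliberately simplified sketch. It assumes a single operator and independent attempts that are each blocked with the same probability, which is far simpler than the Markov chain models compared in the paper; the function name and the blocking value are hypothetical.

```python
def completion_probability(block_prob, max_attempts):
    """Probability that a call completes within max_attempts tries.

    Assumes each attempt is blocked independently with probability
    block_prob (an illustrative simplification, not the paper's model).
    """
    still_trying = 1.0  # probability that no earlier attempt succeeded
    completed = 0.0     # probability absorbed in the "call completed" state
    for _ in range(max_attempts):
        completed += still_trying * (1.0 - block_prob)
        still_trying *= block_prob
    return completed


for attempts in range(1, 5):
    print(attempts, completion_probability(0.5, attempts))
```

Under this assumption the completion probability equals 1 - b^k for blocking probability b and k attempts, so each extra attempt adds a smaller gain, which matches the narrowing gaps between the success ratios quoted in the abstract.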