• Title/Summary/Keyword: 로지스틱 모형 (logistic model)

Search Result 536

Undecided inference using logistic regression for credit evaluation (신용평가에서 로지스틱 회귀를 이용한 미결정자 추론)

  • Hong, Chong-Sun;Jung, Min-Sub
    • Journal of the Korean Data and Information Science Society / v.22 no.2 / pp.149-157 / 2011
  • Undecided inference can be regarded as a missing data problem under either the missing at random (MAR) or the missing not at random (MNAR) assumption. Under the MAR assumption, undecided inference uses a logistic regression model: the probability of default for the undecided group is obtained from the regression coefficient vector estimated on the decided group and is compared with the probability of default for the decided group. Under the MNAR assumption, undecided inference uses a logistic regression model with an additional feature random vector. Simulation results based on two real data sets are obtained and compared. Under the MAR assumption, the misclassification rates differ little from those of the raw data. Under the MNAR assumption, however, the misclassification rates are lower than those under MAR, and they decrease as the proportion of the undecided group increases.
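The MAR step described in the abstract above (estimate coefficients on the decided group, then score the undecided group with them) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the single predictor, the toy data, and the function names `fit_logistic` and `default_probability` are all assumptions, and the MNAR variant with its additional feature vector is not shown.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, iters=25):
    """Fit P(y=1|x) = sigmoid(b0 + b1*x) by Newton-Raphson (one predictor)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p              # score for the intercept
            g1 += (y - p) * x        # score for the slope
            h00 += w                 # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        # Newton step: solve (information matrix) * delta = score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def default_probability(x, coef):
    b0, b1 = coef
    return sigmoid(b0 + b1 * x)

# Hypothetical decided group: risk score x and observed default indicator y.
decided_x = [1, 2, 3, 4, 5, 6, 7, 8]
decided_y = [0, 0, 1, 0, 1, 0, 1, 1]
coef = fit_logistic(decided_x, decided_y)

# Hypothetical undecided group: outcomes unobserved; under MAR the
# decided-group coefficients are reused to infer default probabilities.
undecided_x = [2, 7]
undecided_p = [default_probability(x, coef) for x in undecided_x]
```

Under MAR the missingness is assumed to depend only on the observed predictors, which is what justifies reusing the decided-group coefficients unchanged.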

Comparison of log-logistic and generalized extreme value distributions for predicted return level of earthquake (지진 재현수준 예측에 대한 로그-로지스틱 분포와 일반화 극단값 분포의 비교)

  • Ko, Nak Gyeong;Ha, Il Do;Jang, Dae Heung
    • The Korean Journal of Applied Statistics / v.33 no.1 / pp.107-114 / 2020
  • Extreme value distributions have often been used to analyze data observed from natural disasters, for example to predict return levels. By extreme value theory, block maxima asymptotically follow the generalized extreme value distribution as the sample size increases; however, this may not hold for small samples. To address this problem, this paper proposes using a log-logistic (LLG) distribution, whose validity is evaluated through goodness-of-fit tests and model selection. The proposed method is illustrated with annual maximum earthquake magnitudes in China. We present the predicted return level and its confidence interval for each return period under the LLG distribution.
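As a rough illustration of how a return level follows from a fitted log-logistic distribution, the sketch below inverts the CDF at probability 1 - 1/T for a return period of T blocks (years, for annual maxima). It assumes the standard two-parameter form F(x) = 1 / (1 + (x/scale)^(-shape)); the abstract does not state the paper's parameterization, and the parameter values here are made up for the example.

```python
def llg_cdf(x, scale, shape):
    """Log-logistic CDF, F(x) = 1 / (1 + (x / scale) ** (-shape)), x > 0."""
    return 1.0 / (1.0 + (x / scale) ** (-shape))

def llg_return_level(scale, shape, period):
    """Level exceeded on average once every `period` blocks.

    Solves F(x) = 1 - 1/period, i.e. x = scale * (p / (1 - p)) ** (1 / shape).
    """
    if period <= 1:
        raise ValueError("return period must be greater than 1")
    p = 1.0 - 1.0 / period
    return scale * (p / (1.0 - p)) ** (1.0 / shape)

# Hypothetical fitted parameters (not taken from the paper).
scale, shape = 5.0, 10.0
levels = {T: llg_return_level(scale, shape, T) for T in (2, 10, 50, 100)}
```

With these assumed parameters the 2-period return level equals the scale parameter, since the scale is the median of the log-logistic distribution.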

Analysis of Stress level of Korean Household Members due to Household Debt (한국국민의 가계 금융부채에 대한 체감도 분석)

  • Oh, Man-Suk;Hyun, Seung-Me
    • The Korean Journal of Applied Statistics / v.22 no.2 / pp.297-307 / 2009
  • Korean household debt is one of the main sources of the current financial crisis. This paper studies how household attributes, such as type of housing (self-owned or rented), education, age, average monthly income of the head of household, and area of residence, affect the stress level of household members due to household debt. We analyze a real data set collected by KB Kookmin Bank in 2004. We treat low versus high stress level as a binary response variable and fit a logistic regression model with the household attributes as explanatory variables. A simple but well-fitting model is selected by backward elimination based on the likelihood statistic for the goodness-of-fit test, and the impact of the attributes on the stress level is studied from the parameter estimates of the selected model. We also perform a similar analysis on a binary response variable that distinguishes households with no debt from the rest. The analysis shows that the stress level tends to be low for households with self-owned houses, high average monthly income, low education level, and young members.

A study on log-density with log-odds graph for variable selection in logistic regression (로지스틱회귀모형의 변수선택에서 로그-오즈 그래프를 통한 로그-밀도비 연구)

  • Kahng, Myung-Wook;Shin, Eun-Young
    • Journal of the Korean Data and Information Science Society / v.23 no.1 / pp.99-111 / 2012
  • The log-density ratio of the conditional densities of the predictors given the response variable provides useful information for variable selection in the logistic regression model. In this paper, we consider which predictors are needed and how they should be included in the model. If the conditional distributions are skewed, they can be modeled as gamma distributions. Under this assumption, linear and log terms are generally included in the model. The log-odds graph is a very useful graphical tool in this study. A graphical study shows that if the conditional distributions of x|y for the two groups overlap significantly, both the linear and quadratic terms are needed. Conversely, if they are well separated, only the linear or log term is needed in the model.
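The statement that gamma conditional distributions lead to linear and log terms can be checked with a short derivation. The notation is an assumption made for this sketch: $\pi_j = P(y=j)$, and $x \mid y=j$ follows a gamma distribution with shape $\alpha_j$ and rate $\beta_j$. By Bayes' rule, the log-odds equal the prior log-odds plus the log-density ratio:

$$\log\frac{P(y=1\mid x)}{P(y=0\mid x)} = \log\frac{\pi_1}{\pi_0} + \log\frac{f_1(x)}{f_0(x)}, \qquad f_j(x) = \frac{\beta_j^{\alpha_j}}{\Gamma(\alpha_j)}\, x^{\alpha_j-1} e^{-\beta_j x}.$$

Taking the ratio of the two gamma densities gives

$$\log\frac{f_1(x)}{f_0(x)} = c + (\alpha_1-\alpha_0)\log x - (\beta_1-\beta_0)\,x, \qquad c = \alpha_1\log\beta_1 - \alpha_0\log\beta_0 - \log\Gamma(\alpha_1) + \log\Gamma(\alpha_0).$$

The log-odds are therefore linear in $\log x$ and $x$: the log term drops out when the shapes are equal, and the linear term drops out when the rates are equal.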

Analysis on the Survivor's Pension Payment with Logistic Regression Model (로지스틱 회귀모형을 이용한 유족연금 수급 분석)

  • Kim, Mi-Jung;Kim, Jin-Hyung
    • The Korean Journal of Applied Statistics / v.21 no.2 / pp.183-200 / 2008
  • Research on efficient management of the National Pension has gained emphasis as society trends toward population aging and low birth rates. In this article, we suggest a statistical model for effective classification and prediction of the reserve for the survivor's pension in Korea. A logistic regression model is used; the correct classification rate and the distribution of the posterior probability for the reserve of the survivor's pension are investigated and compared with the results from general logistic models. The predictive model is also assessed with a lift graph, the ROC curve, and the K-S statistic. As an application of the suggested model, we suggest strategies for reducing financial risks in managing and planning the pension.
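Two of the assessment measures named above can be computed directly from predicted probabilities. The sketch below is a generic illustration with made-up scores, not the paper's data or code: `ks_statistic` takes the largest gap between the empirical score distributions of the two classes, and `roc_auc` computes the area under the ROC curve as the share of (positive, negative) pairs ranked correctly, with ties counted as one half.

```python
def ks_statistic(scores, labels):
    """Largest gap between the empirical score CDFs of the two classes."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical predicted probabilities and true classes.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
ks = ks_statistic(scores, labels)
auc = roc_auc(scores, labels)
```

Both functions are quadratic in the sample size, which is acceptable for a small illustration but not for large portfolios.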

Prediction of fine dust PM10 using a deep neural network model (심층 신경망모형을 사용한 미세먼지 PM10의 예측)

  • Jeon, Seonghyeon;Son, Young Sook
    • The Korean Journal of Applied Statistics / v.31 no.2 / pp.265-285 / 2018
  • In this study, we applied a deep neural network model to predict four grades of fine dust $PM_{10}$ ('Good', 'Moderate', 'Bad', 'Very Bad') and two grades ('Good or Moderate', 'Bad or Very Bad'). The deep neural network model and existing classification techniques (a neural network model, a multinomial logistic regression model, a support vector machine, and a random forest) were applied to daily fine dust data observed from 2010 to 2015 in six major metropolitan areas of Korea. The data analysis shows that the deep neural network model outperforms the others in terms of accuracy.

The estimation of winning rate in Korean professional baseball league (한국 프로야구의 승률 추정)

  • Kim, Soon-Kwi;Lee, Young-Hoon
    • Journal of the Korean Data and Information Science Society / v.27 no.3 / pp.653-661 / 2016
  • In this paper, we provide a suitable optimal exponent for the generalized Pythagorean theorem and propose using the logistic model and the probit model to estimate the winning rate in the Korean professional baseball league. The efficiencies of the proposed models are compared with those of the Pythagorean theorem under the root-mean-square error (RMSE) criterion. We use the teams' historical win-loss records in the Korean professional baseball league from 1982 to the first half of 2015. Under the RMSE criterion, the proposed methods slightly outperform the generalized Pythagorean method.
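For readers unfamiliar with the generalized Pythagorean formula, the sketch below shows its usual form, RS^γ / (RS^γ + RA^γ), where RS is runs scored and RA is runs allowed, together with an RMSE helper. It also shows one way to see the link to a logistic model: the same expression is a logistic function of γ·log(RS/RA). The exponent 2 is the classic default, not the optimal exponent reported in the paper, and the paper's logistic and probit models may use different covariates; the team totals are invented for the example.

```python
import math

def pythagorean_win_rate(runs_scored, runs_allowed, exponent=2.0):
    """Generalized Pythagorean expectation: RS^g / (RS^g + RA^g)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def logistic_form(runs_scored, runs_allowed, exponent=2.0):
    """Same quantity written as a logistic function of log(RS / RA)."""
    z = exponent * math.log(runs_scored / runs_allowed)
    return 1.0 / (1.0 + math.exp(-z))

def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual winning rates."""
    n = len(predicted)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

# Hypothetical season totals: (runs scored, runs allowed, actual winning rate).
teams = [(800, 600, 0.62), (700, 700, 0.51), (650, 750, 0.44)]
predicted = [pythagorean_win_rate(rs, ra) for rs, ra, _ in teams]
error = rmse(predicted, [w for _, _, w in teams])
```

Comparing `error` across candidate exponents is one simple way to choose γ under the RMSE criterion.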

Evaluation of EBLUP-Type Estimator Based on a Logistic Linear Mixed Model for Small Area Unemployment (소지역 실업자수 추정을 위한 로지스틱 선형혼합모형 기반 EBLUP 타입 추정량 평가)

  • Kim, Seo-Young;Kwon, Soon-Pil
    • The Korean Journal of Applied Statistics / v.23 no.5 / pp.891-908 / 2010
  • In Korea, small area estimation is currently rarely used to produce official statistics, because the reliability of small area estimates may be difficult to determine, even though the method is well suited to generating small area statistics for Korea. This paper examines how to produce small area unemployment estimates using small area estimation. To estimate small area unemployment, we use an EBLUP-type estimator based on a logistic linear mixed model. To evaluate the EBLUP-type estimator, we carry out a real data analysis and a simulation experiment using population and housing census data. In addition, the small area estimates are compared with large sample survey estimates. We find that the method presented in this paper is highly recommendable for producing small area unemployment figures as official statistics.