• Title/Summary/Keyword: logistic model

A credit classification method based on generalized additive models using factor scores of mixtures of common factor analyzers (공통요인분석자혼합모형의 요인점수를 이용한 일반화가법모형 기반 신용평가)

  • Lim, Su-Yeol;Baek, Jang-Sun
    • Journal of the Korean Data and Information Science Society / v.23 no.2 / pp.235-245 / 2012
  • Logistic discrimination is a useful statistical technique for quantitative analysis in the financial services industry. In particular, it is easy to implement and achieves a good classification rate. The generalized additive model is useful for credit scoring because it shares the advantages of logistic discrimination and can also account for nonlinear effects of the explanatory variables. However, it may need too many additive terms when the number of explanatory variables is very large, and there may be dependencies among the variables. Mixtures of factor analyzers can be used for dimension reduction of high-dimensional features. This study proposes using the low-dimensional factor scores of mixtures of factor analyzers as the new features in the generalized additive model. The approach is demonstrated on the classification of real credit scoring data. A comparison of the correct classification rates of competing techniques shows the superiority of the generalized additive model using factor scores.

Study on Detection Technique for Cochlodinium polykrikoides Red tide using Logistic Regression Model under Imbalanced Data (불균형 데이터 환경에서 로지스틱 회귀모형을 이용한 Cochlodinium polykrikoides 적조 탐지 기법 연구)

  • Bak, Su-Ho;Kim, Heung-Min;Kim, Bum-Kyu;Hwang, Do-Hyun;Enkhjargal, Unuzaya;Yoon, Hong-Joo
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.13 no.6 / pp.1353-1364 / 2018
  • This study proposes a method for detecting Cochlodinium polykrikoides red tide pixels in satellite images using a logistic regression model, a machine learning technique, under imbalanced data. Spectral profiles extracted from red tide, clear water, and turbid water were used as the training dataset. 70% of the entire dataset was used for model training, and the classification accuracy of the model was evaluated on the remaining 30%. Because red tide had relatively few samples compared with clear water and turbid water, white noise was added to the red tide spectral profiles and over-sampling was performed to address the imbalanced data problem. In the accuracy evaluation, the proposed algorithm showed about 94% classification accuracy.
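The noise-based over-sampling step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `oversample_with_noise`, the noise level `sigma`, and the toy spectral values are all assumptions, since the abstract does not state them.

```python
import random


def oversample_with_noise(minority, target_count, sigma=0.01, seed=0):
    """Grow `minority` to `target_count` rows by copying randomly chosen
    rows and adding zero-mean Gaussian (white) noise to every value.

    `sigma` is a hypothetical noise level; the abstract does not give one.
    """
    if not minority:
        raise ValueError("minority must contain at least one sample")
    rng = random.Random(seed)
    augmented = [list(row) for row in minority]  # keep the original rows
    while len(augmented) < target_count:
        base = rng.choice(minority)
        augmented.append([value + rng.gauss(0.0, sigma) for value in base])
    return augmented


# Toy red tide spectral profiles (made-up numbers, three bands each).
red_tide = [[0.012, 0.020, 0.031], [0.014, 0.022, 0.029]]
balanced_red_tide = oversample_with_noise(red_tide, target_count=10)
```

In a 70/30 evaluation like the one described, a common precaution is to over-sample only the training portion so that noisy copies of a sample do not appear in the test set; whether the study did this is not stated in the abstract.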

Inferential Problems in Bayesian Logistic Regression Models (베이지안 로지스틱 회귀모형에서의 추론에 대한 연구)

  • Hwang, Jin-Soo;Kang, Sung-Chan
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1149-1160 / 2011
  • Model selection and hypothesis testing problems in Bayesian inference are still debated among scholars. Bayes factors, traditionally used as a criterion in Bayesian hypothesis testing and model selection, are easy to understand but sometimes hard to compute. In addition, there are other model selection criteria, such as the DIC (Deviance Information Criterion) of Spiegelhalter et al. (2002), and Bayesian p-values for testing. In this paper, we briefly introduce Bayesian hypothesis testing and model selection procedures. We also apply Bayesian inference to the Swiss banknote data by fitting a logistic regression model and computing several test statistics to see whether they provide consistent results.

Prediction of fine dust PM10 using a deep neural network model (심층 신경망모형을 사용한 미세먼지 PM10의 예측)

  • Jeon, Seonghyeon;Son, Young Sook
    • The Korean Journal of Applied Statistics / v.31 no.2 / pp.265-285 / 2018
  • In this study, we applied a deep neural network model to predict four grades of fine dust $PM_{10}$ ('Good', 'Moderate', 'Bad', 'Very Bad') and two grades ('Good or Moderate', 'Bad or Very Bad'). The deep neural network model and existing classification techniques (a neural network model, a multinomial logistic regression model, a support vector machine, and a random forest) were applied to daily fine dust data observed from 2010 to 2015 in six major metropolitan areas of Korea. The data analysis shows that the deep neural network model outperforms the others in terms of accuracy.

Eurasian Otter (Lutra lutra) Habitat Suitability Modeling Using GIS; A case study on Soraksan National Park

  • Park, Chong-Hwa;Joo, Wooyeong;Seo, Chang-Wan
    • Spatial Information Research / v.10 no.4 / pp.501-513 / 2002
  • The Eurasian otter (Lutra lutra) is an endangered wildlife species whose population is declining in Korea. To manage and conserve habitat for the Eurasian otter, it is crucial to understand which habitat components affect otter habitat quality. The objectives of this study were to develop a habitat suitability model for the Eurasian otter in Soraksan National Park and to validate the model in Odaesan National Park. The research methods were as follows. First, trace data and characteristics of Eurasian otter habitat were collected with Geographic Information System (GIS) data and Global Positioning System (GPS) receivers between 2000 and 2002. Second, habitat use factors were identified as habitat characteristics of the Eurasian otter and classified with habitat use and availability analyses. Third, significant factors of the habitat model were extracted with the Chi-square test. Finally, the Eurasian Otter Habitat Suitability Model (EOHSM) was built using logistic regression. According to the EOHSM, otter habitat use in Soraksan National Park was positively associated with reed and shrub areas adjacent to streams, the size of boulders, and low human disturbance. The model had a classification accuracy of 74.4% at a cutoff value of 0.5. Model validation showed a classification accuracy of 86.6% at a cutoff value of 0.5 for otter habitat in Odaesan National Park.
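As a reading aid for "classification accuracy at a cutoff value of 0.5", the sketch below shows how predicted probabilities from a logistic regression are typically turned into presence/absence calls and scored. The function name and the toy numbers are assumptions for illustration and are not taken from the study.

```python
def accuracy_at_cutoff(probabilities, labels, cutoff=0.5):
    """Share of cases where (probability >= cutoff) matches the 0/1 label."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have equal length")
    predictions = [1 if p >= cutoff else 0 for p in probabilities]
    correct = sum(1 for pred, y in zip(predictions, labels) if pred == y)
    return correct / len(labels)


# Hypothetical predicted habitat-use probabilities and observed use (1 = used).
toy_probabilities = [0.9, 0.4, 0.6, 0.2]
toy_labels = [1, 0, 0, 0]
toy_accuracy = accuracy_at_cutoff(toy_probabilities, toy_labels)
```

Here three of the four toy sites are classified correctly; the third site has probability 0.6 but was not used, so it counts as a miss.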

Forecasting the consumption of dairy products in Korea using growth models

  • Jaesung, Cho;Jae Bong, Chang
    • Korean Journal of Agricultural Science / v.48 no.4 / pp.987-1001 / 2021
  • One of the most critical issues in the dairy industry, alongside the low birth rate and the aging population, is the decrease in demand for milk. In this study, the consumption trends of 12 major dairy products distributed in Korea were predicted using the logistic model, the Gompertz model, and the Bass diffusion model, which are representative S-shaped growth models. The 12 dairy products are fermented milk (liquid type, cream type), butter, milk powder (modified, whole, skim), liquid milk (market, flavored), condensed milk, cheese (natural, processed), and cream. The analysis indicates that the growth potential of butter, condensed milk, natural cheese, processed cheese, and cream consumption is relatively high among the 12 dairy products, whereas consumption of the remaining dairy products is expected to stagnate or decrease. However, butter and cream are by-products of the skim milk powder manufacturing process. Therefore, even if the consumption of butter and cream grows, it is difficult to increase the demand for domestic milk unless the production of skim milk powder from domestic milk also increases. Consequently, to support the domestic dairy industry, policy support should focus on increasing domestic milk usage for the production of condensed milk, natural cheese, and processed cheese.
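For readers unfamiliar with the three S-shaped growth models named in this abstract, the sketch below writes each in one common textbook parameterization. The abstract does not say which parameterization the authors used, so the parameter names (`k`, `a`, `b`, `m`, `p`, `q`) and the example values are assumptions.

```python
import math


def logistic_growth(t, k, a, b):
    """Logistic curve: k / (1 + a * exp(-b * t)); k is the saturation level."""
    return k / (1.0 + a * math.exp(-b * t))


def gompertz_growth(t, k, a, b):
    """Gompertz curve: k * exp(-a * exp(-b * t)); asymmetric S-shape."""
    return k * math.exp(-a * math.exp(-b * t))


def bass_cumulative(t, m, p, q):
    """Bass diffusion, cumulative adoption at time t.

    m: market potential, p: innovation coefficient, q: imitation coefficient.
    """
    decay = math.exp(-(p + q) * t)
    return m * (1.0 - decay) / (1.0 + (q / p) * decay)


# Illustrative values only: saturation level 100, evaluated at t = 0 and t = 100.
start = logistic_growth(0, k=100, a=9, b=0.5)
late = logistic_growth(100, k=100, a=9, b=0.5)
```

All three curves level off at their ceiling (`k` or `m`) as `t` grows, which is why a fitted ceiling close to current consumption suggests stagnation, while a ceiling well above it suggests remaining growth potential.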

Designing Neural Network Using Genetic Algorithm (유전자 알고리즘을 이용한 신경망 설계)

  • Park, Jeong-Sun
    • The Transactions of the Korea Information Processing Society / v.4 no.9 / pp.2309-2314 / 1997
  • The study introduces a neural network to predict the bankruptcy of insurance companies. As a method to optimize the network, a genetic algorithm suggests optimal structure and network parameters. The neural network designed by genetic algorithm is compared with discriminant analysis, logistic regression, ID3, and CART. The robust neural network model shows the best performance among those models compared.

Application of Statistical Models for Default Probability of Loans in Mortgage Companies

  • Jung, Jin-Whan
    • Communications for Statistical Applications and Methods / v.7 no.2 / pp.605-616 / 2000
  • Three primary interests frequently raised by mortgage companies are introduced, and the corresponding statistical approaches to estimating default probability in mortgage companies are examined. The statistical models considered in this paper are time series, logistic regression, decision tree, neural network, and discrete time models. The use of the models is illustrated with an artificially modified data set, and each model is evaluated in an appropriate manner.

A study on the forecasting of instant messinger's users choice using neural network (인공신경망을 이용한 인스턴트 메신저 선택 예측에 관한 연구)

  • Kim Dong Sung;Kim Gye Soo
    • Proceedings of the Korean Society for Quality Management Conference / 2004.04a / pp.597-602 / 2004
  • This study examined the forecasting of instant messenger users' choices using a neural network. The methods used were logistic regression, MDA (multiple discriminant analysis), and an ANN (artificial neural network). The forecasting performance of the ANN was better than that of the conventional models (logistic regression and MDA).

A Study on V50 Calculation in Bulletproof Test using Logistic Regression Model (로지스틱 회귀모형을 활용한 방탄시험에서의 V50 산출방안)

  • Gu, Seung Hwan;Noh, Seung Min;Song, Seung Hwan
    • Journal of Korean Society for Quality Management / v.46 no.3 / pp.453-464 / 2018
  • Purpose: The purpose of this study is to propose a solution for cases where $V_{50}$ cannot be calculated during bulletproof testing. Methods: We propose a $V_{50}$ estimation method using logistic regression analysis. Six scenarios were constructed by combining the homogeneity of the sample and the speed range. Then, 1,000 simulations were performed per scenario, and six assumptions reflecting reality were applied. Results: The study confirmed that there was no statistically significant difference between the $V_{50}$ value calculated by the conventional method and the $V_{50}$ value calculated by the improved method. Therefore, in situations where $V_{50}$ cannot be calculated, it is reasonable to use logistic regression analysis. Conclusion: This study develops a methodology that is easy to use and reliable, using a statistical model based on actual data.
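The abstract does not spell out how $V_{50}$ is read off a logistic regression, so the sketch below shows one standard reading, stated as an assumption: if the perforation probability is modeled as $P(v) = 1 / (1 + e^{-(b_0 + b_1 v)})$, then $P = 0.5$ exactly where $b_0 + b_1 v = 0$, giving $V_{50} = -b_0 / b_1$. The function names, the Newton-Raphson fitting routine, and the shot data are illustrative and are not taken from the paper.

```python
import math


def fit_logistic_1d(x, y, iterations=25):
    """Fit P(y=1|x) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson.

    Assumes the outcomes overlap in x (no perfect separation); otherwise the
    maximum likelihood estimate does not exist and the loop will not settle.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # entries of X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def estimate_v50(velocities, perforated):
    """Velocity at which the fitted perforation probability equals 0.5."""
    mean_v = sum(velocities) / len(velocities)
    centered = [v - mean_v for v in velocities]  # centering aids stability
    b0, b1 = fit_logistic_1d(centered, perforated)
    return mean_v - b0 / b1


# Hypothetical shots: impact velocity (m/s) and outcome (1 = perforation).
shot_velocities = [400, 410, 420, 430, 440, 450, 460, 470]
shot_outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
v50_estimate = estimate_v50(shot_velocities, shot_outcomes)
```

The toy outcomes are mirror-symmetric around 435 m/s, so the fitted intercept on the centered scale is essentially zero and the estimate lands at about 435 m/s.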