• Title/Summary/Keyword: Logit model

Assessment of the Willingness to Pay for Forest Management in the Upstream for Water Quality Improvement within the Han River Watershed (수질개선을 위한 한강 수계 상류지역 산림관리 지불의사금액 추정)

  • Kim, Dong-Hyun;Kim, Chul-Sang;Lee, Ho-Sang;Park, Kyung-Seok;Mun, Ji-Min;Jeon, Hyon-Sun
    • Journal of Environmental Policy
    • /
    • v.14 no.2
    • /
    • pp.49-72
    • /
    • 2015
  • Forests in the upstream contribute to improving the quality of water resources for residents downstream. However, a structural examination of how the Han River Watershed Management Fund was spent showed that the fund was not spent on forest management in the upstream. An additional budget must be allocated if the Watershed Management Committee is to contribute to the management of the upstream forests with this awareness. Therefore, the aim of the study was to assess the willingness to pay and to calculate the budget for forest management in the upstream for water quality improvement. Three hundred surveys of watershed beneficiaries were conducted using a biased sampling method. The results were analyzed with a conditional logit model and a mixed logit model. Forest management, the target variable, was found to be statistically significant. Based on this result, the expected budget was estimated at between 20,526 million won and 20,928 million won.
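
The abstract above does not spell out how willingness to pay is derived from the logit estimates. A minimal sketch, assuming the usual linear-in-parameters utility in which marginal willingness to pay is the negative ratio of an attribute coefficient to the cost coefficient, is shown here. The coefficient values, the household count, and the function names are hypothetical and are not taken from the study.

```python
def marginal_wtp(beta_attribute: float, beta_cost: float) -> float:
    """Marginal willingness to pay implied by a linear-in-parameters logit utility.

    Assumes utility U = beta_attribute * x + beta_cost * cost + error,
    so WTP for one unit of x is -beta_attribute / beta_cost.
    """
    if beta_cost == 0:
        raise ValueError("cost coefficient must be non-zero")
    return -beta_attribute / beta_cost


def aggregate_budget(wtp_per_household: float, households: int) -> float:
    """Scale a per-household WTP to a total budget (simple multiplication)."""
    return wtp_per_household * households


# Hypothetical coefficients (NOT the study's estimates).
beta_forest = 0.8      # utility gain from the forest management attribute
beta_cost = -0.0004    # utility change per won of additional payment

wtp = marginal_wtp(beta_forest, beta_cost)
budget = aggregate_budget(wtp, households=1_000_000)  # hypothetical population
print(f"WTP per household: {wtp:.0f} won, aggregate: {budget:.0f} won")
```

The sign convention matters: a negative cost coefficient and a positive attribute coefficient give a positive willingness to pay.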

Comparative study of prediction models for corporate bond rating (국내 회사채 신용 등급 예측 모형의 비교 연구)

  • Park, Hyeongkwon;Kang, Junyoung;Heo, Sungwook;Yu, Donghyeon
    • The Korean Journal of Applied Statistics
    • /
    • v.31 no.3
    • /
    • pp.367-382
    • /
    • 2018
  • Existing studies have developed prediction models for corporate bond ratings using various methods such as linear regression, ordered logit, and random forest. These models are built on financial characteristics that are expected to be included in the rating agencies' own assignment models. However, the number of rating classes in existing studies varies from 5 to 20, and the prediction models were developed with samples that differ in target companies and observation periods. Thus, a simple comparison of the prediction accuracies reported in each study cannot determine the best prediction model. To conduct a fair comparison, this study collected corporate bond ratings and financial characteristics from 2013 to 2017 and applied the prediction models to them. In addition, we applied the elastic-net penalty to linear regression, ordered logit, and ordered probit. Our comparison shows that data-driven variable selection using the elastic net improves prediction accuracy in each corresponding model, and that the random forest is the most appropriate model in terms of prediction accuracy, achieving 69.6% accuracy in exact rating prediction on average under 5-fold cross-validation.
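
For readers unfamiliar with the elastic-net penalty mentioned in this abstract, the following sketch shows one common way to write it. The parameterization (a mixing weight `alpha` between the L1 and squared L2 terms, scaled by `lam`) is an assumption borrowed from the widely used glmnet convention; the abstract does not say which form the authors used, and the coefficient values are made up.

```python
def elastic_net_penalty(coefficients, lam, alpha):
    """Elastic-net penalty: lam * (alpha * L1 + (1 - alpha) / 2 * L2^2).

    alpha = 1 gives a lasso-style penalty, alpha = 0 a ridge-style penalty.
    """
    l1 = sum(abs(b) for b in coefficients)
    l2_squared = sum(b * b for b in coefficients)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


# Hypothetical coefficient vector; the zero entry contributes nothing.
coefs = [1.0, -2.0, 0.0]
lasso_like = elastic_net_penalty(coefs, lam=0.5, alpha=1.0)
ridge_like = elastic_net_penalty(coefs, lam=0.5, alpha=0.0)
mixed = elastic_net_penalty(coefs, lam=0.5, alpha=0.5)
print(lasso_like, ridge_like, mixed)
```

The L1 part is what drives some coefficients exactly to zero, which is the "data-driven variable selection" the abstract refers to.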

The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.1
    • /
    • pp.83-102
    • /
    • 2021
  • The government recently announced various policies for developing the big-data and artificial intelligence fields, giving the public a great opportunity through the disclosure of high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea and is strongly committed to backing export companies through various systems. Nevertheless, there are still few cases of realized business models based on big-data analyses. In this situation, this paper aims to develop a new business model that can be applied to ex-ante prediction of the likelihood of a credit guarantee insurance accident. We utilize internal data from KSURE, which supports export companies in Korea, and apply machine learning models. We then compare the performance of predictive models including Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, many researchers have tried to find better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. The prediction of financial distress or bankruptcy originated with Smith (1930), Fitzpatrick (1932), and Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), which is based on multiple discriminant analysis and is still widely used in both research and practice. It is a score model that uses five key financial ratios to predict the probability of bankruptcy within the next two years. Ohlson (1980) introduces the logit model to address some limitations of previous models. Furthermore, Elmer and Borowski (1988) develop and examine a rule-based, automated system that conducts the financial analysis of savings and loans.
Since the 1980s, researchers in Korea have studied the prediction of financial distress or bankruptcy. Kim (1987) analyzes financial ratios and develops a prediction model. Han et al. (1995, 1996, 1997, 2003, 2005, 2006) construct prediction models using various techniques, including artificial neural networks. Yang (1996) introduces multiple discriminant analysis and the logit model. Kim and Kim (2001) use artificial neural network techniques for ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely with diverse models such as Random Forest or SVM. One major distinction between our research and previous research is that we examine the predicted probability of default for each sample case, rather than only the classification accuracy of each model over the entire sample. Most predictive models in this paper show a classification accuracy of about 70% on the entire sample. Specifically, the LightGBM model shows the highest accuracy, 71.1%, and the Logit model the lowest, 69%. However, we find that these results are open to multiple interpretations. In a business context, more emphasis must be placed on minimizing type 2 error, which causes more harmful operating losses for the guaranty company. Thus, we also compare classification accuracy after splitting the predicted probability of default into ten equal intervals. When we examine the classification accuracy for each interval, the Logit model has the highest accuracy, 100%, for the 0~10% interval of predicted default probability, but a relatively low accuracy of 61.5% for the 90~100% interval.
On the other hand, Random Forest, XGBoost, LightGBM, and DNN give more desirable results: they show higher accuracy for both the 0~10% and 90~100% intervals of predicted default probability, and lower accuracy around a predicted probability of 50%. As for the distribution of samples across predicted default probabilities, both LightGBM and XGBoost place a relatively large number of samples in the 0~10% and 90~100% intervals. Although Random Forest has an advantage in classification accuracy over a small number of cases, LightGBM or XGBoost could be the more desirable model because they classify a large number of cases into the two extreme intervals, even allowing for their relatively low classification accuracy. Considering the importance of type 2 error and total prediction accuracy, XGBoost and DNN show superior performance. Random Forest and LightGBM follow with good results, while logistic regression shows the worst performance. However, each predictive model has a comparative advantage under different evaluation standards. For instance, the Random Forest model shows almost 100% accuracy for samples expected to have a high probability of default. Collectively, more comprehensive ensemble models can be constructed that combine multiple machine learning classifiers and use majority voting to maximize overall performance.
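
The interval-wise evaluation described in this abstract (accuracy within each of ten equal-width bands of predicted default probability) could be computed roughly as follows. This is a sketch under stated assumptions: the 0.5 decision threshold, the function name, and the sample data are hypothetical, and the abstract does not describe the authors' actual procedure in this detail.

```python
def accuracy_by_probability_bin(probs, labels, n_bins=10, threshold=0.5):
    """Return {bin_index: (n_cases, accuracy)} for equal-width probability bins.

    Bin 0 covers predicted probabilities in [0, 0.1), bin 9 covers [0.9, 1.0].
    A case counts as correct when (prob >= threshold) matches its label.
    """
    counts = [0] * n_bins
    correct = [0] * n_bins
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        counts[b] += 1
        predicted = 1 if p >= threshold else 0
        correct[b] += int(predicted == y)
    # Skip empty bins to avoid dividing by zero.
    return {b: (counts[b], correct[b] / counts[b])
            for b in range(n_bins) if counts[b]}


# Hypothetical predicted default probabilities and true outcomes (1 = default).
probs = [0.03, 0.07, 0.45, 0.55, 0.93, 0.97, 0.99]
labels = [0, 0, 1, 1, 1, 0, 1]
result = accuracy_by_probability_bin(probs, labels)
for b, (n, acc) in sorted(result.items()):
    print(f"{b * 10}~{(b + 1) * 10}%: n={n}, accuracy={acc:.3f}")
```

Reporting the case count next to each accuracy reflects the abstract's point that a model with high accuracy in the extreme intervals is less useful if few cases land there.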

A Study on the Theme Park Users' Choice behavior -Application of Constraints-Induced Conjoint Choice Model- (주제공원 이용자들의 선택행동 연구 -Constraints-Induced Conjoint Choice Model의 적용-)

  • 홍성권;이용훈
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.28 no.2
    • /
    • pp.18-27
    • /
    • 2000
  • The importance of constraints has been a major issue in predicting recreation choice behavior; however, the traditional conjoint choice model either did not consider the effects of these variables or failed to integrate them adequately into the choice model. The purposes of this research are (a) to estimate the effects of constraints on theme park choice behavior using the constraints-induced conjoint choice model, and (b) to test the additional explanatory power of the constraints in the suggested model against the more parsimonious traditional model. A leading polling agency was employed to select respondents. Both alternative-generating and choice-set-generating fractional factorial designs were used to meet the necessary and sufficient conditions for calibrating the constraints-induced conjoint choice model. The alternative-specific model was calibrated. The log-likelihood ratio test favored the suggested model over the traditional model, and the goodness-of-fit ($\rho^2$) of the suggested and traditional models was 0.48427 and 0.47950, respectively. There was no difference between the traditional and suggested models in the estimates for the attribute levels of car and shuttle bus, because alternatives were created to estimate the effects of constraints independently of mode-related variables. Most parameter values of the constraints had the expected sign and magnitude: the results reflected the characteristics of the theme parks, such as the abundance of natural attractions and poor accessibility of Everland, the indoor location of major rides at Lotte World, the city-park-like character of Dream Land, and traffic jams in Seoul. Instead of the multinomial logit model, the nested logit model is recommended for future research because it more reasonably reflects the real decision-making process in park choice. Development of a new methodology to integrate this hierarchical decision-making into the choice model is anticipated.
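
The two fit measures named in this abstract can be illustrated with a short sketch. It assumes that $\rho^2$ is McFadden's likelihood ratio index and that the nested models are compared with the standard likelihood ratio statistic; the abstract does not report log-likelihood values, so the numbers below are hypothetical.

```python
def mcfadden_rho_squared(ll_model: float, ll_null: float) -> float:
    """Likelihood ratio index: 1 - LL(model) / LL(null)."""
    return 1.0 - ll_model / ll_null


def likelihood_ratio_statistic(ll_restricted: float, ll_unrestricted: float) -> float:
    """-2 * (LL_restricted - LL_unrestricted).

    Under the null this is asymptotically chi-square distributed, with degrees
    of freedom equal to the number of extra parameters in the larger model.
    """
    return -2.0 * (ll_restricted - ll_unrestricted)


# Hypothetical log-likelihoods (NOT the study's values).
ll_null = -1000.0         # model with no explanatory variables
ll_traditional = -520.0   # restricted model without constraint terms
ll_suggested = -515.0     # model with the additional constraint terms

rho2_traditional = mcfadden_rho_squared(ll_traditional, ll_null)
rho2_suggested = mcfadden_rho_squared(ll_suggested, ll_null)
lr = likelihood_ratio_statistic(ll_traditional, ll_suggested)
print(rho2_traditional, rho2_suggested, lr)
```

A small gain in $\rho^2$ can still correspond to a significant likelihood ratio statistic, which is consistent with the close values the abstract reports.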

Validity of Gravity Models for Individual Choices (개인별 선택행위에서의 동력모형의 유효성)

  • 음성직
    • Journal of Korean Society of Transportation
    • /
    • v.1 no.1
    • /
    • pp.43-47
    • /
    • 1983
  • Within the conventional transportation planning process, "trip distribution" has a significant role to play. The most widely applied trip distribution model is the gravity model, for which Wilson provided the theoretical basis in 1967. The concept of the gravity model, however, still remains ambiguous if we analyze the "trip distribution" with a disaggregate data set. Thus, this paper hypothesizes that the gravity technique is still valid even with the disaggregate data set, by proving that the estimated coefficients of the gravity model, which is derived under the principle of entropy maximization, are identical with those of the multinomial logit model, which is derived under the principle of individual utility maximization.
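
The equivalence claimed in this abstract can be checked numerically for one textbook case. The sketch below assumes a production-constrained gravity model with an exponential deterrence function and a multinomial logit whose systematic utility is the log of destination attractiveness minus a cost term; these functional forms, the symbol names, and the data are assumptions for illustration, since the abstract does not give the paper's specification.

```python
import math


def gravity_shares(attractions, costs, beta):
    """Destination shares from a production-constrained gravity model.

    share_j = D_j * exp(-beta * c_j) / sum_k D_k * exp(-beta * c_k)
    """
    weights = [d * math.exp(-beta * c) for d, c in zip(attractions, costs)]
    total = sum(weights)
    return [w / total for w in weights]


def logit_shares(attractions, costs, beta):
    """Multinomial logit probabilities with V_j = ln(D_j) - beta * c_j."""
    utilities = [math.log(d) - beta * c for d, c in zip(attractions, costs)]
    shift = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - shift) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Hypothetical destination attractiveness and travel costs from one origin.
attractions = [100.0, 250.0, 50.0]
costs = [10.0, 20.0, 5.0]
beta = 0.1

g = gravity_shares(attractions, costs, beta)
l = logit_shares(attractions, costs, beta)
print(g)
print(l)
```

Both functions produce the same shares because exp(ln D_j - beta c_j) equals D_j exp(-beta c_j), so the two models share the same cost coefficient in this setting.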

A Proportional Odds Mixed-Effects Model for Ordinal Data

  • Choi, Jae-Sung
    • Journal of the Korean Data and Information Science Society
    • /
    • v.18 no.2
    • /
    • pp.471-479
    • /
    • 2007
  • This paper discusses how to build a mixed-effects model for analysing ordinal response data using cumulative logits. Random factors are assumed to come from the designed sampling scheme for choosing observational units. Since the observed responses of individuals are ordinal, a proportional odds model with two random effects is suggested. The estimation procedure for the unknown parameters of the suggested model is also discussed through an illustrative example.
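
To make the cumulative-logit idea concrete, the sketch below turns a set of cutpoints and a linear predictor into category probabilities. It assumes the common form logit P(Y <= k) = cutpoint_k - eta, where eta adds a fixed effect and two random-effect draws; the sign convention, the parameter values, and the names are hypothetical and may differ from the paper's formulation.

```python
import math


def category_probabilities(cutpoints, eta):
    """Category probabilities from cumulative logits.

    Assumes logit P(Y <= k) = cutpoints[k] - eta with increasing cutpoints,
    so the same eta shifts every cumulative logit (proportional odds).
    """
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]
    cumulative.append(1.0)  # P(Y <= last category) is always 1
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Hypothetical values: one covariate plus two random effects (u and v),
# e.g. realized draws for two levels of a sampling design.
beta, x = 0.7, 1.5
u, v = 0.3, -0.1
eta = beta * x + u + v

probs = category_probabilities(cutpoints=[-1.0, 0.5, 2.0], eta=eta)
print(probs)
```

With three cutpoints the response has four ordered categories, and a larger eta moves probability mass toward the higher categories under this sign convention.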

A Mixed Model for Ordered Response Categories

  • Choi, Jae-Sung
    • Journal of the Korean Data and Information Science Society
    • /
    • v.15 no.2
    • /
    • pp.339-345
    • /
    • 2004
  • This paper deals with a mixed logit model for ordered polytomous data. Two types of factors affect the response variable in this paper. One is a fixed factor with finite quantitative levels, and the other is a random factor arising from an experimental structure such as a randomized complete block design. The paper discusses how to set up the model for analyzing ordered polytomous data and illustrates how to estimate the parameters of the given model.

Analysis of methods for the model extraction without training data (학습 데이터가 없는 모델 탈취 방법에 대한 분석)

  • Hyun Kwon;Yonggi Kim;Jun Lee
    • Convergence Security Journal
    • /
    • v.23 no.5
    • /
    • pp.57-64
    • /
    • 2023
  • In this study, we analyzed how to steal a target model without training data. Input data is generated using a generative model, and a similar model is created by defining a loss function so that the predicted values of the target model and the similar model are close to each other. The similar model is trained by gradient descent to match the target model, using the target model's logit value for each class on the generated input data. The TensorFlow machine learning library was used as the experimental environment, and CIFAR-10 and SVHN were used as datasets. A similar model was created using a ResNet model as the target model. The experiment found that the model stealing method generated a similar model with an accuracy of 86.18% for CIFAR-10 and 96.02% for SVHN, producing predicted values similar to those of the target model. In addition, considerations on the model stealing method, military use, and limitations were also analyzed.
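
The loss that pulls the similar model toward the target model could look like the following. This is only a sketch: it assumes a mean absolute difference between the two models' per-class logits, which is one choice used in data-free extraction work, while the abstract does not name the exact loss. The logit values are hypothetical, and the code uses plain Python lists rather than the TensorFlow setup mentioned in the abstract.

```python
def logit_matching_loss(target_logits, clone_logits):
    """Mean absolute difference between two batches of per-class logits.

    Each argument is a list of rows, one row of class logits per input.
    """
    total = 0.0
    count = 0
    for target_row, clone_row in zip(target_logits, clone_logits):
        for t, c in zip(target_row, clone_row):
            total += abs(t - c)
            count += 1
    return total / count


# Hypothetical logits for two generated inputs and three classes.
target = [[2.0, -1.0, 0.5], [0.0, 3.0, -2.0]]
clone_before = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # untrained similar model
clone_after = [[1.5, -1.0, 0.5], [0.0, 2.5, -2.0]]  # after some training

loss_before = logit_matching_loss(target, clone_before)
loss_after = logit_matching_loss(target, clone_after)
print(loss_before, loss_after)
```

In a real setup this quantity would be minimized with respect to the similar model's weights by gradient descent, with the inputs supplied by the generative model.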

Determinants of NIMBY Attitudes of Local Residents in Jeju, Korea - An Application of Two-choice Model - (제주시 지역주민들의 님비 행위 결정요인에 대한 연구: 2변수 선택모형의 적용)

  • Kim, Hyuncheol
    • Environmental and Resource Economics Review
    • /
    • v.13 no.4
    • /
    • pp.685-715
    • /
    • 2004
  • This study applies a two-choice model to identify the major determinants of NIMBY attitudes when a large-scale composting facility is built near a residential area. Using survey data from residents of Jeju City, Korea, logit estimation is implemented. The empirical results are consistent with the implication of the specified model: a representative resident's NIMBY attitude is positively (negatively) affected by "Negative Neighborhood Characteristic Variables" ("Positive Wealth Attribute variables"). Socio-demographic variables are mostly statistically insignificant, which implies that policy makers may have to take into consideration their region-specific socio-demographic factors instead of simply emulating policies that have been successful elsewhere.

Identifying Key Factors to Affect Vehicle Inspection and Maintenance(I/M) Test Results Using a Binary Logit Model (California Case Study) (이항로짓모형을 이용한 자동차 배출가스 검사결과에 미치는 요인분석(미국 캘리포니아 사례를 중심으로))

  • Chu, Sang-Ho
    • Journal of Korean Society of Transportation
    • /
    • v.24 no.3 s.89
    • /
    • pp.189-195
    • /
    • 2006
  • For the past decades, vehicle emissions have been a major source of air pollution in urban areas. Vehicle inspection and maintenance (I/M) test programs were developed for major metropolitan areas to reduce urban air pollution. However, there are few studies exploring the major factors that influence I/M test failure. This study develops a logit model to identify key factors affecting overall test failure, using the vehicle I/M test data from California in October 2002. The model results indicate that vehicle age, odometer reading, engine size, vehicle make, presence of emissions control equipment, and test type have significant effects on the probability of I/M test failure.
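
As a reminder of how a binary logit model maps factors such as these to a failure probability, here is a minimal sketch. The intercept, coefficients, and feature names are hypothetical and chosen only to illustrate the formula; they are not the estimates reported in the study.

```python
import math


def failure_probability(intercept, coefficients, features):
    """Binary logit: P(fail) = 1 / (1 + exp(-(intercept + sum(beta_k * x_k))))."""
    z = intercept + sum(coefficients[name] * value
                        for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients (NOT the study's estimates).
coefficients = {"vehicle_age_years": 0.15, "odometer_10k_miles": 0.05}
intercept = -3.0

older = failure_probability(
    intercept, coefficients,
    {"vehicle_age_years": 15, "odometer_10k_miles": 12})
newer = failure_probability(
    intercept, coefficients,
    {"vehicle_age_years": 3, "odometer_10k_miles": 3})

# exp(beta) is the multiplicative change in the odds per unit of a variable.
odds_ratio_age = math.exp(coefficients["vehicle_age_years"])
print(f"older: {older:.3f}, newer: {newer:.3f}, odds ratio per year: {odds_ratio_age:.3f}")
```

Categorical factors such as vehicle make or test type would enter the same formula as indicator (0/1) variables.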