• Title/Summary/Keyword: Sample Selection Model

The wage determinants applying sample selection bias (표본선택 편의를 반영한 임금결정요인 분석)

  • Park, Sungik;Cho, Jangsik
    • Journal of the Korean Data and Information Science Society / v.27 no.5 / pp.1317-1325 / 2016
  • The purpose of this paper is to explain the factors affecting the wages of vocational high school graduates. In particular, we examine the effectiveness of controlling for sample selection bias by employing the Tobit model and the Heckman sample selection model. The major results are as follows. First, the Tobit model and the Heckman sample selection model, both of which control for sample selection bias, are shown to be statistically significant. Hence all the independent variables seem to be statistically consistent with the theoretical model. Second, gender is statistically significant in both the probability of employment and the wage. Third, the employment probability and wage of Meister high school graduates are higher than those of all other graduates. Fourth, the higher the parents' income, the higher both the employment probability and the wage. Finally, parents' education level, high school grades, satisfaction, and the number of licenses are statistically significant in both the probability of employment and the wage.

A new sample selection model for overdispersed count data (과대산포 가산자료의 새로운 표본선택모형)

  • Jo, Sung Eun;Zhao, Jun;Kim, Hyoung-Moon
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.733-749 / 2018
  • Sample selection arises as a result of the partial observability of the outcome of interest in a study. Heckman introduced a sample selection model to analyze such data and proposed a full maximum likelihood estimation method under the assumption of normality. Recently, sample selection models for binomial and Poisson response variables have been proposed. Based on the theory of symmetry-modulated distributions, we extend these to a model for overdispersed count data. Data of this type without sample selection are often modeled using the negative binomial distribution. Hence we propose a sample selection model for overdispersed count data using the negative binomial distribution. The model is applied to a real data set. Simulation studies reveal that our estimation method based on the profile log-likelihood is stable.
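
To make "overdispersed" concrete, the sketch below compares the variance of a Poisson count with that of a negative binomial count at the same mean. It assumes the common mean-dispersion (NB2) parameterization, in which the variance is mean + mean²/dispersion; the abstract does not say which parameterization the paper uses, and the function names are illustrative only.

```python
def poisson_variance(mean: float) -> float:
    """Variance of a Poisson count: always equal to its mean."""
    if mean < 0:
        raise ValueError("mean must be non-negative")
    return mean


def negative_binomial_variance(mean: float, dispersion: float) -> float:
    """Variance under the NB2 parameterization (an assumption of this sketch).

    Var = mean + mean**2 / dispersion, so the variance always exceeds the
    mean, and it approaches the Poisson variance as dispersion grows.
    """
    if mean < 0 or dispersion <= 0:
        raise ValueError("mean must be >= 0 and dispersion must be > 0")
    return mean + mean ** 2 / dispersion


if __name__ == "__main__":
    mean = 4.0
    print("Poisson variance:", poisson_variance(mean))
    for dispersion in (1.0, 10.0, 1000.0):
        print("dispersion", dispersion, "->",
              negative_binomial_variance(mean, dispersion))
```

Small dispersion values give heavy overdispersion, while large values bring the model close to the Poisson case.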

A Note on Parametric Bootstrap Model Selection

  • Lee, Kee-Won;Songyong Sim
    • Journal of the Korean Statistical Society / v.27 no.4 / pp.397-405 / 1998
  • We develop parametric bootstrap model selection criteria for an example in which a random sample is fitted to either a general normal distribution or a normal distribution with a prespecified mean. We apply the bootstrap methods in two ways: one directly substitutes the estimated parameter for the unknown parameter, and the other focuses on bias correction. These bootstrap model selection criteria are compared with AIC. We illustrate that all the selection rules reduce to the one-sample t-test, where the cutoff points converge to fixed values as the sample size increases.
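
The reduction to a t-test can be checked for plain AIC with a short sketch. It assumes the variance is unknown in both candidate models, which the abstract does not state, and it covers only AIC, not the paper's bootstrap criteria. Under that assumption, AIC prefers the free-mean model exactly when t² > (n − 1)(exp(2/n) − 1), and this cutoff tends to 2 as n grows. All function names are illustrative.

```python
import math


def aic_prefers_free_mean(sample, mu0):
    """True if AIC picks N(mu, sigma^2) over N(mu0, sigma^2), sigma^2 unknown."""
    n = len(sample)
    xbar = sum(sample) / n
    # Maximum likelihood variance estimates under each model.
    var_free = sum((x - xbar) ** 2 for x in sample) / n
    var_fixed = sum((x - mu0) ** 2 for x in sample) / n
    # AIC up to the constant n*log(2*pi) + n shared by both models.
    aic_free = n * math.log(var_free) + 2 * 2    # parameters: mu, sigma^2
    aic_fixed = n * math.log(var_fixed) + 2 * 1  # parameter: sigma^2
    return aic_free < aic_fixed


def t_statistic(sample, mu0):
    """One-sample t statistic for H0: mean == mu0."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    return (xbar - mu0) / math.sqrt(s2 / n)


def aic_t_cutoff(n):
    """|t| threshold equivalent to the AIC rule above."""
    return math.sqrt((n - 1) * (math.exp(2 / n) - 1))


if __name__ == "__main__":
    data = [0.2, -0.5, 1.3, 0.9, 0.4]
    print(aic_prefers_free_mean(data, 0.0))
    print(abs(t_statistic(data, 0.0)) > aic_t_cutoff(len(data)))
    print(aic_t_cutoff(10 ** 6), math.sqrt(2))
```

The first two printed values agree for any sample with at least two distinct observations, because both express the same inequality.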

An Alternative Parametric Estimation of Sample Selection Model: An Application to Car Ownership and Car Expense (비정규분포를 이용한 표본선택 모형 추정: 자동차 보유와 유지비용에 관한 실증분석)

  • Choi, Phil-Sun;Min, In-Sik
    • Communications for Statistical Applications and Methods / v.19 no.3 / pp.345-358 / 2012
  • In a parametric sample selection model, the distributional assumption is critical to obtaining consistent estimates. Conventionally, the normality assumption has been adopted for the error terms in both the selection and main equations of the model. The normality assumption, however, may excessively restrict the true underlying distribution of the model. This study introduces the $S_U$-normal distribution as the error distribution of a sample selection model. The $S_U$-normal distribution can accommodate a wider range of skewness and kurtosis than the normal distribution. It also includes the normal distribution as a limiting distribution. Moreover, the $S_U$-normal distribution can be easily extended to multivariate dimensions. We provide the log-likelihood function and expected value formula based on a bivariate $S_U$-normal distribution in a sample selection model. Simulation results indicate that the $S_U$-normal model outperforms the normal model in terms of the consistency of the estimators. As an empirical application, we apply the sample selection model to the relationship between car ownership and car expenses.

Corporate Debt Choice: Application of Panel Sample Selection Model (기업의 부채조달원 선택에 관한 연구: 패널표본선택모형의 적용)

  • Lee, Ho Sun
    • The Journal of the Korea Contents Association / v.15 no.7 / pp.428-435 / 2015
  • An examination of corporate financing statistics in Korea reveals several trends. First, large enterprises use both bank loans and direct financing, such as corporate bonds, as debt. Second, small and medium-sized companies mainly use bank loans only. I therefore argue that there is sample selection bias in corporate debt choice and that a sample selection methodology is more appropriate for analyzing corporate debt choice behavior. I test a panel sample selection model using data on listed Korean firms from 1990 to 2013 and find that the panel sample selection model is appropriate.

The wage determinants of college graduates using Heckman's sample selection model (Heckman의 표본선택모형을 이용한 대졸자의 임금결정요인 분석)

  • Cho, Jangsik
    • Journal of the Korean Data and Information Science Society / v.28 no.5 / pp.1099-1107 / 2017
  • In this study, we analyze the determinants of the wages of college graduates using data from the "2014 Graduates Occupational Mobility Survey" conducted by the Korea Employment Information Service. In general, wages contain two pieces of information: whether an individual is employed and the size of the wage. However, many previous studies on wage determinants are subject to sample selection bias because they perform linear regression using only the information on wage size. We use the Heckman sample selection model to overcome this problem. The main results are summarized as follows. First, the validity of Heckman's sample selection model is statistically supported. Second, males have a significantly higher employment probability and wage than females. Third, as age and parents' income increase, both the probability of employment and the size of wages increase. Finally, as university satisfaction and the number of certifications acquired increase, both the probability of employment and the wage tend to increase.

A Study on Determinants Affecting At-home Laver Consumption Expenditures : Type II Tobit Model Treating Sample Selection Bias (김 가정 소비 지출의 결정 요인 분석 : 선택 편의를 고려한 Type II 토빗 모형을 이용하여)

  • Lee, Min-Kyu;Park, Eun-Young
    • The Journal of Fisheries Business Administration / v.40 no.3 / pp.147-167 / 2009
  • The objective of this study is to analyze the determinants of at-home laver consumption expenditures using data from a household survey conducted in 2009. The non-response ratios for monthly expenditures on dry laver and flavored laver among the sampled households are 18.8% and 25.6%, respectively. Accordingly, this study analyzes the determinants of at-home laver consumption expenditures using the type II Tobit model, one of the sample selection models, to deal with the sample selection bias caused by non-response. The results show that age positively affects expenditures on dry laver but negatively affects expenditures on flavored laver. In addition, household size, household income, and the degree of preference for laver are positively related to both expenditures. The household size elasticity and income elasticity of expenditure on dry laver are estimated at 0.220 and 0.251, respectively; for flavored laver, these elasticities are estimated at 0.484 and 0.261. These results can inform the segmentation of the at-home laver consumption market into groups with a high willingness to spend and the design of detailed marketing strategies to increase at-home laver consumption. The methodology of this study can also be applied to consumer preference analysis for other marine products and to other analyses of samples with non-response data in fishery research.

A CONSISTENT AND BIAS CORRECTED EXTENSION OF AKAIKE'S INFORMATION CRITERION(AIC) : AICbc(k)

  • Kwon, Soon H.;Ueno, M.;Sugeno, M.
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.2 no.1 / pp.41-60 / 1998
  • This paper derives a consistent and bias-corrected extension of Akaike's Information Criterion (AIC), $AIC_{bc}$, based on Kullback-Leibler information. This criterion has terms that penalize overparametrization more strongly than AIC does for both small and large samples, which overcomes the overfitting problem of asymptotically efficient model selection criteria in small and large samples. The $AIC_{bc}$ also provides consistent model order selection. Thus, it is widely applicable to data with small and/or large sample sizes, and to cases where the number of free parameters is a relatively large fraction of the sample size. Relationships with other model selection criteria, such as the $AIC_c$ of Hurvich and the CAICF of Bozdogan, are discussed. The empirical performance of the $AIC_{bc}$ in choosing the model order of a linear regression model is studied using a Monte Carlo experiment.
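
The abstract does not give the formula for $AIC_{bc}$, so it is not reproduced here. As a point of reference, the sketch below computes plain AIC and the small-sample corrected $AIC_c$ mentioned in the abstract, using its standard form AIC + 2k(k + 1)/(n − k − 1). It shows how the extra penalty grows when the number of parameters k is a large fraction of the sample size n. The function names are illustrative.

```python
def aic(log_likelihood: float, k: int) -> float:
    """Akaike's Information Criterion: -2 * log-likelihood + 2 * k."""
    return -2.0 * log_likelihood + 2.0 * k


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Small-sample corrected AIC (standard AICc form).

    The correction 2k(k+1)/(n-k-1) vanishes as n grows with k fixed and
    becomes large when k is a sizable fraction of n.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + 2.0 * k * (k + 1) / (n - k - 1)


if __name__ == "__main__":
    log_lik, k = -50.0, 3
    for n in (10, 20, 1000):
        print(n, aic(log_lik, k), aicc(log_lik, k, n))
```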

Korean women wage analysis using selection models (표본 선택 모형을 이용한 국내 여성 임금 데이터 분석)

  • Jeong, Mi Ryang;Kim, Mijeong
    • Journal of the Korean Data and Information Science Society / v.28 no.5 / pp.1077-1085 / 2017
  • In this study, we identify the major factors affecting Korean women's wages by analyzing data from the 2015 Korea Labor Panel Survey (KLIPS). In general, wage data are difficult to analyze because random sampling is infeasible. The Heckman sample selection model is the most widely used method for analyzing data subject to sample selection. Heckman proposed two kinds of selection models: one estimated by maximum likelihood and the other the Heckman two-stage model. The Heckman two-stage model is known to be robust to violations of the bivariate normality assumption on the error terms. More recently, Marchenko and Genton (2012) proposed the Heckman selection-t model, which generalizes the Heckman two-stage model, and concluded that the selection-t model is more robust to the error assumptions. We analyze the data with the two models and compare the results.
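
The core of the two-stage approach is the inverse Mills ratio, which enters the second-stage regression as an extra covariate. The sketch below is a minimal illustration, not the paper's implementation. It assumes bivariate normal errors with a unit-variance selection error, and it takes the first-stage probit index `z_gamma` as given rather than estimating it. All names are hypothetical.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(z: float) -> float:
    """lambda(z) = pdf(z) / cdf(z); underflows for very negative z."""
    return normal_pdf(z) / normal_cdf(z)


def selected_mean(x_beta: float, z_gamma: float, rho: float, sigma: float) -> float:
    """E[y | selected] = x*beta + rho * sigma * lambda(z*gamma).

    rho is the correlation between the outcome and selection errors and
    sigma is the outcome error's standard deviation. With rho == 0 there
    is no selection bias and the mean reduces to x*beta.
    """
    return x_beta + rho * sigma * inverse_mills_ratio(z_gamma)


if __name__ == "__main__":
    # A weaker selection index (smaller z_gamma) gives a larger correction.
    for z_gamma in (-1.0, 0.0, 1.0):
        print(z_gamma, selected_mean(2.0, z_gamma, rho=0.5, sigma=1.0))
```

In a full two-stage fit, the index would come from a first-stage probit on the whole sample, and the second stage would regress observed wages on the covariates plus the estimated ratio.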

Bayesian estimation for finite population proportion under selection bias via surrogate samples

  • Choi, Seong Mi;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society / v.24 no.6 / pp.1543-1550 / 2013
  • In this paper, we study Bayesian estimation of the finite population proportion in binary data under selection bias. We use a Bayesian nonignorable selection model to accommodate the selection mechanism. We compare four possible estimators of the finite population proportion based on data analysis as well as Monte Carlo simulation. It turns out that the nonignorable selection model may be useful for weakly biased samples.