• Title/Summary/Keyword: sample selection

The wage determinants applying sample selection bias (표본선택 편의를 반영한 임금결정요인 분석)

  • Park, Sungik;Cho, Jangsik
    • Journal of the Korean Data and Information Science Society / v.27 no.5 / pp.1317-1325 / 2016
  • The purpose of this paper is to explain the factors affecting the wages of vocational high school graduates. We particularly examine the effectiveness of controlling for sample selection bias by employing the Tobit model and the Heckman sample selection model. The major results are as follows. First, the Tobit model and the Heckman sample selection model, which control for sample selection bias, are shown to be statistically significant. Hence all the independent variables seem to be statistically consistent with the theoretical model. Second, gender was statistically significant for both the probability of employment and the wage. Third, the employment probability and wage of Meister high school graduates were high compared with all other graduates. Fourth, the higher the parents' income, the higher both the employment probability and the wage. Finally, parents' education level, high school grade, satisfaction, and the number of licenses were found to be statistically significant for both the probability of employment and wages.

A new sample selection model for overdispersed count data (과대산포 가산자료의 새로운 표본선택모형)

  • Jo, Sung Eun;Zhao, Jun;Kim, Hyoung-Moon
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.733-749 / 2018
  • Sample selection arises as a result of the partial observability of the outcome of interest in a study. Heckman introduced a sample selection model to analyze such data and proposed a full maximum likelihood estimation method under the assumption of normality. Recently, sample selection models for binomial and Poisson response variables have been proposed. Based on the theory of symmetry-modulated distributions, we extend these to a model for overdispersed count data. Data of this type without sample selection are often modeled using the negative binomial distribution. Hence we propose a sample selection model for overdispersed count data using the negative binomial distribution. A real data application is presented. Simulation studies reveal that our estimation method based on the profile log-likelihood is stable.

Corporate Debt Choice: Application of Panel Sample Selection Model (기업의 부채조달원 선택에 관한 연구: 패널표본선택모형의 적용)

  • Lee, Ho Sun
    • The Journal of the Korea Contents Association / v.15 no.7 / pp.428-435 / 2015
  • Examining corporate financing statistics in Korea, I identified several trends. First, large enterprises raise debt through both bank loans and direct financing such as corporate bonds. Second, small and medium-sized companies mainly use bank loans only. I therefore argue that there is sample selection bias in corporate debt choice and that a sample selection methodology is more appropriate for analysing corporate debt choice behavior. I tested a panel sample selection model using data on listed Korean firms from 1990 to 2013 and found that the panel sample selection model is appropriate.

A Note on Parametric Bootstrap Model Selection

  • Lee, Kee-Won;Songyong Sim
    • Journal of the Korean Statistical Society / v.27 no.4 / pp.397-405 / 1998
  • We develop parametric bootstrap model selection criteria in an example that fits a random sample to either a general normal distribution or a normal distribution with a prespecified mean. We apply the bootstrap methods in two ways: one considers the direct substitution of the estimated parameter for the unknown parameter, and the other focuses on bias correction. These bootstrap model selection criteria are compared with AIC. We illustrate that all the selection rules reduce to the one-sample t-test, where the cutoff points converge to certain points as the sample size increases.

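
The reduction to a one-sample t-test described in the abstract above can be checked directly for AIC alone. The following is a minimal sketch, not the paper's bootstrap criteria: it assumes maximum likelihood fits of both normal models, and the function names and the prespecified mean `mu0` are hypothetical. Under these assumptions AIC prefers the general model exactly when |t| exceeds sqrt((n - 1)(exp(2/n) - 1)), which tends to sqrt(2) as n grows.

```python
import math
import random


def aic_normal(xs, fixed_mean=None):
    """AIC of a normal model fitted by maximum likelihood.

    With fixed_mean given, only the variance is estimated (1 parameter);
    otherwise the mean and the variance are both estimated (2 parameters).
    """
    n = len(xs)
    mean = sum(xs) / n if fixed_mean is None else fixed_mean
    var_mle = sum((x - mean) ** 2 for x in xs) / n
    log_lik = -0.5 * n * (math.log(2 * math.pi * var_mle) + 1)
    n_params = 2 if fixed_mean is None else 1
    return -2 * log_lik + 2 * n_params


def t_statistic(xs, mu0):
    """One-sample t statistic for H0: mean == mu0 (mu0 is a hypothetical null value)."""
    n = len(xs)
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)  # unbiased variance
    return math.sqrt(n) * (mean - mu0) / math.sqrt(s2)


def aic_t_cutoff(n):
    """Value of |t| above which AIC prefers the general normal model."""
    return math.sqrt((n - 1) * math.expm1(2 / n))


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.gauss(0.4, 1.0) for _ in range(30)]
    by_aic = aic_normal(sample) < aic_normal(sample, fixed_mean=0.0)
    by_t = abs(t_statistic(sample, 0.0)) > aic_t_cutoff(len(sample))
    print("AIC prefers general model:", by_aic)
    print("t-test rule agrees:", by_aic == by_t)
```

The equivalence follows because the fixed-mean variance estimate equals the free-mean estimate plus the squared distance between the sample mean and `mu0`, so the AIC difference depends on the data only through t.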

The wage determinants of college graduates using Heckman's sample selection model (Heckman의 표본선택모형을 이용한 대졸자의 임금결정요인 분석)

  • Cho, Jangsik
    • Journal of the Korean Data and Information Science Society / v.28 no.5 / pp.1099-1107 / 2017
  • In this study, we analyzed the determinants of wages of college graduates using data from the "2014 Graduates Occupational Mobility Survey" conducted by the Korea Employment Information Service. In general, wages contain two pieces of information: whether an individual is employed and the size of the wage. However, many previous studies on wage determinants perform linear regression using only the information on wage size, which tends to generate sample selection bias. We used Heckman sample selection models to overcome this problem. The main results are summarized as follows. First, the validity of Heckman's sample selection model is statistically significant. Second, men have a significantly higher employment probability and wage than women. Third, as age and parents' income increase, both the probability of employment and the size of wages are higher. Finally, as university satisfaction and the number of certifications acquired increase, both the probability of employment and the wage tend to increase.
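
The selection bias that the abstract above attributes to regressing on observed wages only can be illustrated with a small simulation. This is a minimal sketch under the textbook bivariate-normal Heckman setup, not the paper's model or data: the outcome is y = beta0 + eps, it is observed only when gamma0 + u > 0, and (u, eps) are jointly normal with correlation rho. All names and parameter values are hypothetical. In this setup the mean of the observed outcomes is beta0 + rho * sigma * lambda(gamma0), where lambda is the inverse Mills ratio.

```python
import math
import random


def inverse_mills(c):
    """Inverse Mills ratio lambda(c) = phi(c) / Phi(c) for the standard normal."""
    pdf = math.exp(-0.5 * c * c) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(c / math.sqrt(2)))
    return pdf / cdf


def simulate_selected_mean(beta0, gamma0, rho, sigma, n, seed=0):
    """Mean of y = beta0 + eps over the units that are selected (gamma0 + u > 0)."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n):
        u = rng.gauss(0, 1)
        # eps has standard deviation sigma and correlation rho with u
        eps = sigma * (rho * u + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1))
        if gamma0 + u > 0:  # outcome observed only for selected units
            total += beta0 + eps
            count += 1
    return total / count


if __name__ == "__main__":
    beta0, gamma0, rho, sigma = 2.0, 0.0, 0.5, 1.0  # hypothetical values
    naive = simulate_selected_mean(beta0, gamma0, rho, sigma, n=200_000)
    predicted = beta0 + rho * sigma * inverse_mills(gamma0)
    print("true mean:", beta0)
    print("mean of observed outcomes:", round(naive, 3))
    print("beta0 + rho*sigma*lambda(gamma0):", round(predicted, 3))
```

Heckman-type estimators remove this bias by modelling the selection step jointly with the outcome; the sketch only shows why ignoring selection shifts the naive estimate when rho is nonzero.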

An Alternative Parametric Estimation of Sample Selection Model: An Application to Car Ownership and Car Expense (비정규분포를 이용한 표본선택 모형 추정: 자동차 보유와 유지비용에 관한 실증분석)

  • Choi, Phil-Sun;Min, In-Sik
    • Communications for Statistical Applications and Methods / v.19 no.3 / pp.345-358 / 2012
  • In a parametric sample selection model, the distributional assumption is critical for obtaining consistent estimates. Conventionally, the normality assumption has been adopted for the error terms in both the selection and main equations of the model. The normality assumption, however, may excessively restrict the true underlying distribution of the model. This study introduces the $S_U$-normal distribution as the error distribution of a sample selection model. The $S_U$-normal distribution can accommodate a wide range of skewness and kurtosis compared to the normal distribution. It also includes the normal distribution as a limiting distribution. Moreover, the $S_U$-normal distribution can be easily extended to multivariate dimensions. We provide the log-likelihood function and expected value formula based on a bivariate $S_U$-normal distribution in a sample selection model. The results of simulations indicate that the $S_U$-normal model outperforms the normal model in terms of the consistency of estimators. As an empirical application, we estimate a sample selection model for the relationship between car ownership and car expenses.

A Study on Determinants Affecting At-home Laver Consumption Expenditures : Type II Tobit Model Treating Sample Selection Bias (김 가정 소비 지출의 결정 요인 분석 : 선택 편의를 고려한 Type II 토빗 모형을 이용하여)

  • Lee, Min-Kyu;Park, Eun-Young
    • The Journal of Fisheries Business Administration / v.40 no.3 / pp.147-167 / 2009
  • The objective of this study is to analyze the determinants of at-home laver consumption expenditures using data from a household survey conducted in 2009. The non-response ratios for monthly expenditures on dry laver and flavored laver among the sampled households were 18.8% and 25.6%, respectively. Accordingly, this study analyzes the determinants of at-home laver consumption expenditures using the type II Tobit model, one of the sample selection models, to deal with the sample selection bias caused by non-response data. The results show that age positively affects expenditures on dry laver but negatively affects expenditures on flavored laver. In addition, household size, household income, and the degree of preference for laver have positive relationships with both expenditures. The household size elasticity and income elasticity of expenditure on dry laver are estimated at 0.220 and 0.251, respectively. For flavored laver, these elasticities are estimated at 0.484 and 0.261. These results can inform the segmentation of the at-home laver consumption market into groups with a high willingness to spend and the implementation of detailed marketing strategies to increase at-home laver consumption. The methodology of this study can be applied to consumer preference analysis for other marine products and to other analyses of samples with non-response data in fishery research.

Efficient Controlled Selection

  • Ryu, Jea-Bok;Lee, Seung-Joo
    • Communications for Statistical Applications and Methods / v.4 no.1 / pp.151-159 / 1997
  • In sample surveys, we would like to select preferred samples, that is, samples that reduce the survey cost and increase the precision of estimators. Goodman and Kish (1950) introduced controlled selection as a method of sample selection that increases the probability of drawing preferred samples while decreasing the probability of drawing nonpreferred samples. In this paper, we obtain controlled plans using the maximum entropy principle, and for the case where the order of nonpreferred samples is considered, we propose an algorithm to obtain a controlled plan.

Camera Source Identification of Digital Images Based on Sample Selection

  • Wang, Zhihui;Wang, Hong;Li, Haojie
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.7 / pp.3268-3283 / 2018
  • With the advent of the Information Age, the source identification of digital images, as a part of digital image forensics, has attracted increasing attention. Therefore, an effective technique to identify the source of digital images is urgently needed. In this paper, first, we study and implement some previous work on image source identification based on sensor pattern noise, such as the Lukas method, the principal component analysis method and the random subspace method. Second, to extract a purer sensor pattern noise, we propose a sample selection method to improve the random subspace method. By analyzing the image texture features, we select a patch with less complexity to extract more reliable sensor pattern noise, which improves the accuracy of identification. Finally, experimental results reveal that the proposed sample selection method can extract a purer sensor pattern noise, which further improves the accuracy of image source identification. At the same time, this approach is less complicated than deep learning models and comes close to state-of-the-art performance.

NEW SELECTION APPROACH FOR RESOLUTION AND BASIS FUNCTIONS IN WAVELET REGRESSION

  • Park, Chun Gun
    • Korean Journal of Mathematics / v.22 no.2 / pp.289-305 / 2014
  • In this paper we propose a new approach to the variable selection problem for a primary resolution and wavelet basis functions in wavelet regression. Most wavelet shrinkage methods focus on thresholding the wavelet coefficients, given a primary resolution which is usually determined by the sample size. However, both a primary resolution and the basis functions are affected by the shape of an unknown function rather than the sample size. Unlike existing methods, our method does not depend on the sample size and also takes into account the shape of the unknown function.