• Title/Summary/Keyword: selection bias

A Study on the Bias Reduction in Split Variable Selection in CART

  • Song, Hyo-Im;Song, Eun-Tae;Song, Moon Sup
    • Communications for Statistical Applications and Methods / v.11 no.3 / pp.553-562 / 2004
  • In this short communication we discuss the bias problems of CART in split variable selection and suggest a method to reduce the variable selection bias. Penalties proportional to the number of categories or distinct values are applied to the splitting criteria of CART. The results of empirical comparisons show that the proposed modification of CART reduces the bias in variable selection.
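
The penalty idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact penalty, so the linear form `penalty_rate * number_of_distinct_values`, the default rate, and all function names below are assumptions for illustration only, not the authors' method. The sketch also treats every variable as ordered and searches threshold splits only.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_binary_gain(values, labels):
    """Largest Gini decrease over all threshold splits `value <= t`."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Every distinct value except the largest is a candidate threshold,
    # so variables with more distinct values get more chances to win.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        gain = parent - len(left) / n * gini(left) - len(right) / n * gini(right)
        best = max(best, gain)
    return best


def penalized_gain(values, labels, penalty_rate=0.01):
    """Best gain minus a penalty proportional to the number of distinct values.

    The linear penalty and the default rate are hypothetical choices.
    """
    return best_binary_gain(values, labels) - penalty_rate * len(set(values))
```

With equal raw gain, a variable with fewer distinct values receives the higher penalized score, which is the direction of correction the abstract describes.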

Bayesian estimation for finite population proportion under selection bias via surrogate samples

  • Choi, Seong Mi;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society / v.24 no.6 / pp.1543-1550 / 2013
  • In this paper, we study Bayesian estimation of the finite population proportion for binary data under selection bias. We use a Bayesian nonignorable selection model to accommodate the selection mechanism. We compare four possible estimators of the finite population proportion using data analysis as well as Monte Carlo simulation. It turns out that the nonignorable selection model might be useful for weakly biased samples.

Pairwise pseudolikelihood approach for adjusting selection bias in meta-analysis (메타분석의 선택 편향 보정을 위한 쌍별 유사가능도 접근법)

  • Kuk, Sunghee;Lee, Woojoo
    • The Korean Journal of Applied Statistics / v.33 no.4 / pp.439-449 / 2020
  • Meta-analysis provides a way of integrating several independent studies of interest. Since small studies with statistically significant results are more likely to be published, publication bias, which is a special case of selection bias, often occurs in meta-analysis. Conditional likelihood and weighted estimating equation approaches have been proposed to deal with publication bias, but they require a correctly specified selection probability model. In contrast, the pairwise pseudolikelihood approach can correct publication bias without fully specifying the selection probability model, but its performance in meta-analysis has not been investigated. In this paper, we perform a numerical study of whether the pairwise pseudolikelihood approach is effective in correcting publication bias arising in typical meta-analysis settings.

Estimation of Wage Equation for College Graduates with Correction for Selection Bias upon Working State (대졸청년층의 취업지역에 대한 자기선택을 고려한 임금함수 추정)

  • Lee, Chiho
    • Journal of Labour Economics / v.42 no.3 / pp.39-74 / 2019
  • In this paper, the wage equations of local labor markets for college graduates in Korea are estimated using the methodology of Dahl (2002) to correct for selection bias. The results suggest that the variation in wage-equation coefficients across local labor markets mostly remains after correcting for selection bias. The gender wage gap is hardly affected by selection bias. The variations in the return to education and in the major premium are reduced by about 18% and 11%, respectively. Meanwhile, the selection bias is negligible in the national capital region, which suggests that college graduates prefer the national capital region regardless of their gender, level of education, and major.

Bias Reduction in Split Variable Selection in C4.5

  • Shin, Sung-Chul;Jeong, Yeon-Joo;Song, Moon Sup
    • Communications for Statistical Applications and Methods / v.10 no.3 / pp.627-635 / 2003
  • In this short communication we discuss the bias problem of C4.5 in split variable selection and suggest a method to reduce the variable selection bias among categorical predictor variables. A penalty proportional to the number of categories is applied to the splitting criterion gain of C4.5. The results of empirical comparisons show that the proposed modification of C4.5 reduces the size of classification trees.

The wage determinants applying sample selection bias (표본선택 편의를 반영한 임금결정요인 분석)

  • Park, Sungik;Cho, Jangsik
    • Journal of the Korean Data and Information Science Society / v.27 no.5 / pp.1317-1325 / 2016
  • The purpose of this paper is to explain the factors affecting the wages of vocational high school graduates. We particularly examine the effectiveness of controlling for sample selection bias by employing the Tobit model and the Heckman sample selection model. The major results are as follows. First, the Tobit model and the Heckman sample selection model, which control for sample selection bias, are shown to be statistically significant. Hence all the independent variables seem to be statistically consistent with the theoretical model. Second, gender was statistically significant for both the probability of employment and the wage. Third, the employment probability and wage of Meister high school graduates were high compared to those of all other graduates. Fourth, the higher the parents' income, the higher both the employment probability and the wage. Finally, parents' education level, high school grade, satisfaction, and number of licenses were found to be statistically significant for both the probability of employment and the wage.

Codon Usage Patterns of Tyrosinase Genes in Clonorchis sinensis

  • Bae, Young-An
    • Parasites, Hosts and Diseases / v.55 no.2 / pp.175-183 / 2017
  • Codon usage bias (CUB) is a unique property of genomes and has contributed to a better understanding of the molecular features and evolutionary processes of particular genes. In this study, genetic indices associated with CUB, including relative synonymous codon usage and effective number of codons, as well as nucleotide composition, were investigated in the Clonorchis sinensis tyrosinase genes and their platyhelminth orthologs, which play an important role in eggshell formation. The relative synonymous codon usage patterns differed substantially among the tyrosinase genes examined. In a neutrality analysis, the correlation between $GC_{12}$ and $GC_3$ was statistically significant, and the regression line had a relatively gradual slope (0.218). The NC-plot, i.e., $GC_3$ vs. effective number of codons (ENC), showed that most of the tyrosinase genes were below the expected curve. The codon adaptation index (CAI) values of the platyhelminth tyrosinases had a narrow distribution between 0.685/0.714 and 0.797/0.837, and were negatively correlated with their ENC. Taken together, these results suggest that CUB in the tyrosinase genes is basically governed by selection pressure rather than mutational bias, although the latter factor provided an additional force in shaping the CUB of the C. sinensis and Opisthorchis viverrini genes. It was also apparent that the equilibrium point between selection pressure and mutational bias is much more inclined toward selection pressure in highly expressed C. sinensis genes than in poorly expressed genes.
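
As a rough illustration of one index named in the abstract above, relative synonymous codon usage (RSCU) is commonly computed as a codon's observed count divided by the mean count over all synonymous codons for the same amino acid. The sketch below follows that standard definition; the function name, the `synonymous_families` parameter, and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter


def rscu(sequence, synonymous_families):
    """Relative synonymous codon usage for each codon in the given families.

    `synonymous_families` maps an amino acid to the list of codons encoding it.
    Families that do not occur in the sequence are skipped.
    """
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
    counts = Counter(sequence[i:i + 3] for i in range(0, usable, 3))
    result = {}
    for family in synonymous_families.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue
        expected = total / len(family)  # count under uniform synonymous usage
        for codon in family:
            result[codon] = counts[codon] / expected
    return result


# Toy example: tyrosine is encoded by TAT and TAC in the standard genetic code.
toy = rscu("TATTATTATTAC", {"Tyr": ["TAT", "TAC"]})
```

An RSCU value above 1 indicates a codon used more often than expected under uniform synonymous usage, and a value below 1 indicates the opposite.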

Suggestions to Improve Selection-Bias in Teaching or Studying Programs (교수 및 학습 프로그램 평가연구의 선별편향성 개선을 위한 제언)

  • Park, Kyoungho
    • Korean Medical Education Review / v.12 no.1 / pp.3-8 / 2010
  • This study is designed to improve evaluations of the effectiveness of teaching or studying programs by overcoming selection bias in such studies. Selection bias arises from unobservable characteristics in the way participants are selected into teaching or studying programs. For cross-sectional data, the instrumental variable (IV) method and two-stage least squares estimation were suggested as analysis tools. Panel data were analyzed using both fixed-effects estimation, in which individual effects are captured by intercept terms, and random-effects estimation, in which an unobserved effect is characterized as being randomly drawn from a given distribution.

A CONSISTENT AND BIAS CORRECTED EXTENSION OF AKAIKE'S INFORMATION CRITERION (AIC): AICbc(k)

  • Kwon, Soon H.;Ueno, M.;Sugeno, M.
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.2 no.1 / pp.41-60 / 1998
  • This paper derives a consistent and bias-corrected extension of Akaike's Information Criterion (AIC), $AIC_{bc}$, based on Kullback-Leibler information. This criterion has terms that penalize overparametrization more strongly than those of AIC for both small and large samples. The overfitting problem of asymptotically efficient model selection criteria for small and large samples is thereby overcome. The $AIC_{bc}$ also provides consistent model order selection. Thus, it is widely applicable to data with small and/or large sample sizes, and to cases where the number of free parameters is a relatively large fraction of the sample size. Relationships with other model selection criteria, such as the $AIC_c$ of Hurvich and the CAICF of Bozdogan, are discussed. The empirical performance of the $AIC_{bc}$ in choosing the model order of a linear regression model is studied using a Monte Carlo experiment.
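
For reference, the two baseline criteria mentioned in the abstract above have the following standard forms, where $\hat{L}$ is the maximized likelihood, $k$ the number of free parameters, and $n$ the sample size (these symbols are notation chosen here, not the paper's). The form of $AIC_{bc}$ is not given in the abstract and is not reproduced.

$$
AIC = -2 \ln \hat{L} + 2k, \qquad
AIC_c = AIC + \frac{2k(k+1)}{n - k - 1}
$$

The correction term in $AIC_c$ grows as $k$ approaches $n$ and vanishes as $n \to \infty$, which is why it matters most when the number of free parameters is a large fraction of the sample size.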

Statistical Problems Caused by Sample Censoring and Their Solutions -Focused on the application to consumer research- (표본중도절단에 따른 통계학적 문제와 교정방법에 관한 고찰 -소비자분야 연구에의 적용을 중심으로-)

  • 나명균
    • Journal of the Korean Home Economics Association / v.33 no.2 / pp.19-27 / 1995
  • This paper discusses the bias that results from using nonrandomly selected samples in consumer research. A two-stage system (maximum likelihood probit analysis followed by ordinary least squares analysis) is a solution to sample selection bias. Empirical results show that correcting for sample selection bias improves the validity of consumer research results.
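
The two-stage system described above appears to correspond to the familiar Heckman-style two-step procedure, in which the first-stage probit index is turned into an inverse Mills ratio that enters the second-stage least squares regression as an extra regressor. The sketch below shows only that correction term, assuming the probit index `z` has already been estimated; the function name is hypothetical and the abstract does not confirm this exact formulation.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def inverse_mills_ratio(z):
    """Return phi(z) / Phi(z) for a first-stage probit index z.

    For selected observations, this value is added as a regressor in the
    second-stage least squares fit to absorb the selection effect.
    """
    return _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(z)


# Observations that were unlikely to be selected (low z) get a larger correction.
ratios = [inverse_mills_ratio(z) for z in (-1.0, 0.0, 1.0)]
```

A coefficient on this term that differs significantly from zero in the second stage is usually read as evidence of sample selection bias.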
