• Title/Summary/Keyword: sample selection

Korean women wage analysis using selection models (표본 선택 모형을 이용한 국내 여성 임금 데이터 분석)

  • Jeong, Mi Ryang;Kim, Mijeong
    • Journal of the Korean Data and Information Science Society / v.28 no.5 / pp.1077-1085 / 2017
  • In this study, we identify the major factors that affect Korean women's wages by analysing data from the 2015 Korea Labor Panel Survey (KLIPS). In general, wage data are difficult to analyse because random sampling is infeasible. The Heckman sample selection model is the most widely used method for analysing data subject to sample selection. Heckman proposed two kinds of selection models: one estimated by maximum likelihood and the other the Heckman two-stage model. The Heckman two-stage model is known to be robust to violations of the normality assumption on the bivariate error terms. Recently, Marchenko and Genton (2012) proposed the Heckman selection-t model, which generalizes the Heckman two-stage model, and concluded that the selection-t model is more robust to the error assumptions. We analysed the data with both models and compared the results.

Variable Bandwidth Selection for Kernel Regression

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / v.5 no.1 / pp.11-20 / 1994
  • In recent years, nonparametric kernel estimators of the regression function have become abundant and widely applicable in many areas of statistics. Most modern research is concerned with selecting a fixed global bandwidth, which uses the same value for all x when estimating the regression function. In this paper, we propose a method for selecting a locally varying bandwidth, based on the bootstrap, for kernel estimation in fixed-design regression. The finite-sample performance of the proposed bandwidth selection method is examined in a Monte Carlo simulation study.

Selection of Geospatial Features for Location Guidance Map Generation

  • Kakinohana, Issei;Nie, Yoshinori;Nakamura, Morikazu;Miyagi, Hayao;Onaga, Kenji
    • Proceedings of the IEEK Conference / 2000.07b / pp.1107-1110 / 2000
  • This paper proposes a procedure for selecting geospatial data in a location guidance map generation system. The procedure takes targets specified by users as input data and outputs the data selected for map generation. The procedure is embedded in a prototype object-oriented GIS. We show sample maps generated by the system.

A Bayes Sequential Selection of the Least Probable Event

  • Hwang, Hyung-Tae;Kim, Woo-Chul
    • Journal of the Korean Statistical Society / v.11 no.1 / pp.25-35 / 1982
  • The problem of selecting the least probable cell in a multinomial distribution is studied in a Bayesian framework. We consider two loss components: the cost of sampling, and the difference in cell probabilities between the selected cell and the least probable cell. A Bayes sequential selection rule is derived with respect to a Dirichlet prior and compared with the best fixed-sample-size selection rule. The continuation sets with respect to the vague prior are tabulated for certain cases.

Selection Problems in terms of Coefficients of Variation

  • Park, Chi-Hoon;Jeon, Jong-Woo;Kim, Woo-Chul
    • Journal of the Korean Statistical Society / v.11 no.1 / pp.12-24 / 1982
  • Selection procedures are proposed for selecting the 'best' industrial process, that is, the one with the smallest fraction defective. For normally distributed industrial processes, this is equivalent to selecting in terms of coefficients of variation. For the case of known variances, the selection procedures of Bechhofer (1954) and Bechhofer and Turnbull (1978) are appropriate. We treat this problem for the case of unknown variances, with or without reference to a standard. The large-sample solutions for the design constants are tabulated, and the performance of these approximate solutions is investigated.

Missing Type I AGNs in the local universe

  • Kim, Ji Gang;Kim, Jae Hyuk;Lee, Seung Eon;Park, Daeseong;Woo, Jong-Hak;Kwon, HongJin
    • The Bulletin of The Korean Astronomical Society / v.37 no.2 / pp.83.2-83.2 / 2012
  • Type I AGNs are classified by the presence of broad emission lines, while Type II AGNs show narrow emission lines only. All-sky surveys such as SDSS provide large AGN samples for statistical studies. However, the AGN samples suffer from selection bias due to incomplete selection criteria. To investigate the missing Type I AGNs in optical spectroscopic surveys, we start with a sample of SDSS Type II AGNs at 0.02 < z < 0.05, using the MPA-JHU SDSS DR7 catalog. We search for the hidden broad $H\alpha$ component with both visual inspection and the multi-component spectral decomposition method. Out of 1383 Type II AGNs, we find a total of 62 missing Type I AGNs (~4.5%). The sample has mean black hole mass $\log(M_{BH}/M_{\odot}) = 6.48 \pm 0.53$ and luminosity $\log(L_{H\alpha}/\mathrm{erg\,s^{-1}}) = 40.52 \pm 0.33$, with Eddington ratio $\log(L_{bol}/L_{Edd}) = -1.51 \pm 0.41$. We will describe the sample and present the $M_{BH}-\sigma_*$ and $M_{BH}-M_*$ relations of the sample in the context of BH-galaxy coevolution.

Selection of Superior Trees for Larger Fruit and High Productivity in Sorbus commixta Hedl.

  • Kim, Sea-Hyun;Jang, Yong-Seok;Chung, Hun-Gwan;Choi, Myoung-Sub;Kim, Sun-Chang
    • Plant Resources / v.6 no.2 / pp.120-128 / 2003
  • This study analyzed the variation in leaf and fruit characteristics among ten selected populations of Sorbus commixta Hedl., with the aim of supporting the conservation of gene resources and providing information for the selection of superior trees. The results can be summarized as follows. In general, the Mt. Sungin population on Ulleung Island showed larger values for the characteristics overall, whereas the Mt. Halla population on Jeju Island showed smaller values. ANOVA tests showed statistically significant differences in all leaf characteristics among the populations as well as among individual trees within populations. For fruit characteristics, however, differences were statistically significant only among the populations. Cluster analysis using the single linkage method, based on leaf and fruit characteristics, showed that the ten selected populations of S. commixta in Korea could be clustered into three groups. Group I is Mt. Sungin on Ulleung Island, Group II is Mt. Halla on Jeju Island, and Group III comprises Osan, Mt. Kaji, Mt. Duckyoo, Mt. Balwang, Mt. Sobaek, Mt. O-dae, Mt. Jiri, and Mt. Taebaek. A selection level based on major agronomic traits, namely Number of Fruit per Fruiting Lateral (NFL) over 50, Fruit Length (FL) and Fruit Width (FW) over 10 mm, and Weight of 100 Fruit (WF100) over 66 g, was applied to 100 sample trees, and five trees were selected. The selection effects of the selected trees in NFL, FL, FW, and WF100 were evaluated as 132%, 151%, 142%, and 264%, respectively, relative to the mean of the 100 sample trees. In particular, Ulleung 2 showed excellent values, with NFL and WF100 of 95 and 69 g, respectively, suggesting a promising new cultivar for larger fruit and high productivity.

Study on the Coefficient of Consolidation of Marine Clay by Rowecell Consolidation Test (ROWECELL시험에 의한 해성점토의 압밀계수에 대한 연구)

  • 김종국;차영일;김혁기;김영웅
    • Proceedings of the Korean Geotechnical Society Conference / 2003.03a / pp.725-732 / 2003
  • In this research, Rowecell tests were performed on undisturbed samples taken with a block sampler ($\phi$: 250 mm, L: 500 mm) and a hydraulic piston sampler ($\phi$: 76 mm, L: 850 mm) from the marine clay of YONGYUDO and YEONGJONGDO. Ratios of the coefficient of consolidation were analyzed by comparison with $C_h$ from CPTu and with $C_v$ and $C_h$ from the existing consolidation test. According to the analysis, the coefficient of consolidation of the block sample is considerably greater than that of the piston sample, and sample disturbance decreases as the diameter of the undisturbed sample increases. The coefficient of consolidation measured by the Rowecell test was greater than that from the existing consolidation test. The Rowecell test also showed the consolidation rate decreasing because of the smear effect caused by mandrel insertion. In addition, $C_h$ from CPTu varies with the analysis method, so selecting a suitable analysis method is judged to be important in determining the coefficient of consolidation.

Classifier Selection using Feature Space Attributes in Local Region (국부적 영역에서의 특징 공간 속성을 이용한 다중 인식기 선택)

  • Shin Dong-Kuk;Song Hye-Jeong;Kim Baeksop
    • Journal of KIISE:Software and Applications / v.31 no.12 / pp.1684-1690 / 2004
  • This paper presents a method for classifier selection that uses distribution information of the training samples in a small region surrounding a sample. The conventional DCS-LA (Dynamic Classifier Selection - Local Accuracy) selects a classifier dynamically by comparing the local accuracy of each classifier at test time, which inevitably requires a long classification time. In the proposed approach, by contrast, the best classifier in a local region is stored in the FSA (Feature Space Attribute) table during training, and testing is done simply by referring to the table. This enables fast classification because the local accuracy of each classifier does not have to be computed at test time. Two feature space attributes are used: the entropy and the density of the k training samples around each sample. Each sample in the feature space is mapped to a point in the attribute space spanned by these two attributes. The attribute space is divided into regular rectangular cells, and the local accuracy of each classifier is attached to each cell. The cells with their associated local accuracies make up the FSA table. At test time, the cell to which a test sample belongs is determined first by calculating the two attributes, and then the most accurate classifier for that cell is chosen from the FSA table. To show the effectiveness of the proposed algorithm, it is compared with the conventional DCS-LA using the Elena database. The experiments show that the accuracy of the proposed algorithm is almost the same as that of DCS-LA, while classification is about four times faster.
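
The table lookup described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the grid size, the toy data, and the two threshold classifiers are all hypothetical, and density is assumed here to be the inverse of the mean distance to the k neighbours, which the abstract does not specify.

```python
import numpy as np

def knn_attributes(x, X_train, y_train, k, exclude_self=False):
    """Entropy and density of the k training samples nearest to x."""
    d = np.linalg.norm(X_train - x, axis=1)
    order = np.argsort(d)
    if exclude_self:
        order = order[1:]  # drop the closest point (x itself when x is a training sample)
    idx = order[:k]
    _, counts = np.unique(y_train[idx], return_counts=True)
    p = counts / counts.sum()
    entropy = float(-(p * np.log2(p)).sum())
    density = float(1.0 / (d[idx].mean() + 1e-12))  # assumed density measure
    return entropy, density

def cell_index(v, edges):
    """Index of the grid cell containing v, clipped to the valid range."""
    return np.clip(np.searchsorted(edges, v, side="right") - 1, 0, len(edges) - 2)

def build_fsa_table(X, y, classifiers, k=5, bins=4):
    """Training phase: store each classifier's accuracy per attribute-space cell."""
    attrs = np.array([knn_attributes(x, X, y, k, exclude_self=True) for x in X])
    edges = [np.linspace(attrs[:, j].min(), attrs[:, j].max(), bins + 1)
             for j in range(2)]
    cells = np.stack([cell_index(attrs[:, j], edges[j]) for j in range(2)], axis=1)
    preds = [clf(X) for clf in classifiers]
    correct = np.zeros((bins, bins, len(classifiers)))
    total = np.zeros((bins, bins))
    for i, (a, b) in enumerate(cells):
        total[a, b] += 1
        for c, pred in enumerate(preds):
            correct[a, b, c] += pred[i] == y[i]
    # Empty cells keep accuracy 0 for every classifier.
    accuracy = correct / np.maximum(total, 1)[..., None]
    return edges, accuracy

def select_classifier(x, X, y, edges, accuracy, k=5):
    """Test phase: compute the two attributes, then look up the best classifier."""
    ent, den = knn_attributes(x, X, y, k)
    a, b = cell_index(ent, edges[0]), cell_index(den, edges[1])
    # argmax falls back to classifier 0 when the cell was empty in training.
    return int(np.argmax(accuracy[a, b]))

# Toy data: two Gaussian classes and two hypothetical threshold classifiers.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
classifiers = [
    lambda Z: (Z[:, 0] > 1.5).astype(int),
    lambda Z: (Z[:, 1] > 1.5).astype(int),
]
edges, accuracy = build_fsa_table(X, y, classifiers)
chosen = select_classifier(np.array([1.4, 2.0]), X, y, edges, accuracy)
print("classifier chosen for the test sample:", chosen)
```

At test time only the two attributes and a table lookup are computed, so no classifier's local accuracy has to be re-estimated per test sample. That is the source of the speed-up the abstract reports over DCS-LA.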