• Title/Abstract/Keywords: stratified random sampling design

38 results found (processing time: 0.023 seconds)

A Study on Economically Optimal Determination of the Parameters of Stratified Random Sampling

  • 황의철;이영식
    • 산업경영시스템학회지 / Vol. 13, No. 21 / pp. 81-90 / 1990
  • In stratified random sampling, a simple random sample is taken in each stratum so as to achieve the maximum gain in precision at minimum cost. The purpose of this paper is to examine the properties of the estimates and their variances and to obtain an economic design for stratified random sampling through optimum allocation of the sample sizes. In addition, the between-stratum and within-stratum variation that arise in stratifying the population are described.
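
The abstract does not give the allocation rule it uses. As a hedged illustration only, the sketch below applies the textbook cost-optimal allocation for a linear cost function, in which each stratum's sample size is proportional to N_h * S_h / sqrt(c_h). The function name and the input values are hypothetical and are not taken from the paper.

```python
from math import sqrt

def optimum_allocation(N_h, S_h, c_h, n):
    """Split a total sample size n across strata in proportion to N_h * S_h / sqrt(c_h)."""
    weights = [N * S / sqrt(c) for N, S, c in zip(N_h, S_h, c_h)]
    total = sum(weights)
    return [n * w / total for w in weights]

# Hypothetical inputs: stratum sizes, stratum standard deviations, per-unit costs.
allocation = optimum_allocation([100, 200], [2.0, 1.0], [4.0, 1.0], n=30)
print(allocation)
```

With equal per-unit costs the rule reduces to Neyman allocation, so a costly or homogeneous stratum receives a smaller share of the sample.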

Efficient Use of Auxiliary Information through the Stratified Sampling and Systematic Sampling Design

  • 김관수;박민규
    • 한국조사연구학회지:조사연구 / Vol. 10, No. 1 / pp. 155-168 / 2009
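  • When auxiliary information is available at the sample design stage, stratified sampling is commonly considered as an efficient sampling method. In particular, when many variables can serve as stratification variables, the total number of strata becomes large, and stratified sampling with one unit per stratum is known to be efficient in that case. With one unit per stratum, however, an unbiased variance estimator cannot be computed. An unbiased variance estimator can be computed under stratified sampling with two units per stratum, obtained by reducing the number of strata and selecting two sampling units from each stratum, but this design is less efficient than one-per-stratum sampling when important stratification variables are omitted. In this study, we review the balanced two-per-stratum stratified sampling proposed by Park & Fuller (2008) and the unbiased variance estimator of the Horvitz-Thompson estimator, and compare several stratified sampling methods and systematic sampling through simulation. We also apply the proposed sampling method to the 2006 Youth Panel data to evaluate its efficiency.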
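
To illustrate why one unit per stratum rules out an unbiased variance estimate, here is a minimal sketch of the textbook variance estimator for a stratified mean under simple random sampling within strata. It is not the balanced design of Park & Fuller (2008) or their Horvitz-Thompson variance estimator. The function name and data are hypothetical.

```python
from statistics import mean, variance

def stratified_mean_and_variance(samples, N_h):
    """Return the stratified mean and its estimated variance.

    samples: one list of observed values per stratum.
    N_h: population size of each stratum.
    """
    N = sum(N_h)
    est, var = 0.0, 0.0
    for y, N_stratum in zip(samples, N_h):
        n_h = len(y)
        if n_h < 2:
            # The within-stratum sample variance is undefined for a single unit.
            raise ValueError("need at least two units per stratum")
        W_h = N_stratum / N      # stratum weight
        f_h = n_h / N_stratum    # sampling fraction
        est += W_h * mean(y)
        var += W_h**2 * (1 - f_h) * variance(y) / n_h
    return est, var

# Hypothetical two-per-stratum sample from two strata of 10 units each.
estimate, var = stratified_mean_and_variance([[1, 3], [2, 6]], [10, 10])
print(estimate, var)
```

Passing a stratum with a single observation raises `ValueError`, which mirrors the limitation described in the abstract.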

Variance estimation for distribution rate in stratified cluster sampling with missing values

  • Heo, Sunyeong
    • Journal of the Korean Data and Information Science Society / Vol. 28, No. 2 / pp. 443-449 / 2017
  • Population proportions, such as the distribution rate of LED TVs or the prevalence of a disease, are often estimated from survey sample data. A population proportion is generally treated as a special form of the population mean. In complex sampling, such as stratified multistage sampling with unequal selection probabilities, the denominator of the mean may be a random variable, and the mean is then estimated as a ratio estimator. In this research, we examined the estimation of a distribution rate under stratified multistage sampling and computed numerical results using stratified random sample data with about 25% missing observations. In the data used for this research, the survey weights were determined in a deterministic way. The weights are therefore not random variables, and the population distribution rate and its variance can be estimated as in population mean estimation. When the weights are not random variables, estimating the variance of the proportion estimator by the ratio method may inflate the variance. Therefore, in estimating the variance of a population proportion, we need to examine the structure of the data and the survey design before deciding on an estimation method.

Policies for Improving the Survey of Research and Development in Science and Technology: The Case of Industrial Sector

  • 유승훈;문혜선
    • 기술혁신학회지 / Vol. 5, No. 2 / pp. 228-244 / 2002
  • The survey of research and development (R&D) in science and technology (S&T) covers the current status of R&D activities in S&T in Korea and provides a basis for decision making on S&T policy. Continuous improvement of the survey is needed to produce reliable national basic statistics. The purpose of this study is therefore two-fold: to introduce a sample survey method for the industrial sector and to develop statistical techniques for dealing with non-response data from that sector. To these ends, case studies of the United States and Japan are first presented. A new sampling design for the R&D survey is proposed, and implementation of a stratified random sampling scheme is suggested. Statistical analysis of the non-response data is also addressed. Based on several screening criteria, we develop a new imputation method suitable for the R&D survey and provide a more detailed implementation plan. Various solutions to the problem of item non-response are also presented. Finally, some implications of the results are discussed.

Optimal Design of the Adaptive Searching Estimation in Spatial Sampling

  • Pyong Namkung;Byun, Jong-Seok
    • Communications for Statistical Applications and Methods / Vol. 8, No. 1 / pp. 73-85 / 2001
  • A spatial population in a plane area, such as an animal or aerial population, has certain relationships among regions located within a fixed distance of a selected region. We consider adaptive searching estimation in spatial sampling for such a population. Adaptive searching estimation depends on the values at sample points observed during the survey and on the nature of the surfaces under investigation. In this paper we study estimation by adaptive searching in spatial sampling for the purpose of estimating the area that possesses a particular characteristic in a spatial population. From the viewpoint of adaptive searching, we empirically compare systematic sampling with stratified sampling in spatial sampling using simulated data.

An Evaluation of Sampling Design for Estimating an Epidemiologic Volume of Diabetes and for Assessing Present Status of Its Control in Korea

  • 이지성;김재용;백세현;박이병;이준영
    • Journal of Preventive Medicine and Public Health / Vol. 42, No. 2 / pp. 135-142 / 2009
  • Objectives: An appropriate sampling strategy for estimating the epidemiologic volume of diabetes was evaluated through simulation. Methods: We analyzed about 250 million medical insurance claims submitted to the Health Insurance Review & Assessment Service in 2003 with diabetes as a principal or subsequent diagnosis at least once per year. The database was reconstructed into a 'patient-hospital profile' of 3,676,164 cases and then into a 'patient profile' of 2,412,082 observations. The patient profile data were then used to test the validity of a proposed sampling frame and of sampling methods for developing diabetes-related epidemiologic indices. Results: The simulation study showed that a stratified two-stage cluster sampling design with a total sample size of 4,000 would provide an estimate of 57.04% (95% prediction range, 49.83-64.24%) for the treatment prescription rate of diabetes. The proposed sampling design first stratifies the nation by area into "metropolitan/city/county" and by type of hospital into "tertiary/secondary/primary/clinic" with proportions of 5:10:10:75. Hospitals are then randomly selected within the strata as primary sampling units, followed by random selection of patients within the hospitals as secondary sampling units. The difference between the estimate and the parameter value was projected to be less than 0.3%. Conclusions: The proposed sampling scheme will be applied to a subsequent nationwide field survey, not only to estimate the epidemiologic volume of diabetes but also to assess the present status of nationwide diabetes control.

Understanding Complex Design Features via Design Effect Models

  • 박인호
    • 응용통계연구 / Vol. 28, No. 6 / pp. 1217-1225 / 2015
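  • In survey data analysis, the efficiency of design features for a sample estimator can be assessed through the design effect, the relative size of the variance under a complex sample design compared with simple random sampling. The usefulness of the design effect is greatest when it can be expressed as a function of the complex design features. In this study, we propose a design effect model applicable to stratified multistage sampling designs. The proposed model generalizes the models of Gabler et al. (1999, 2006) for multistage sampling and expresses explicitly, in functional form, the influence of design features such as stratum structure, sample allocation, cluster sampling, and unequal weighting on the level of precision. The model can be used to predict the design effect implied by a sample size chosen to achieve a prespecified precision. It can also be used after the fact to evaluate the efficiency of individual design features for a sample estimator.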
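
The paper's generalized model is not reproduced in the abstract. As a simpler point of reference, the sketch below computes Kish's well-known approximation, which multiplies a weighting factor n * sum(w^2) / (sum w)^2 by a clustering factor 1 + (b - 1) * rho. The function name and inputs are hypothetical, and the stratum-structure and allocation components described in the abstract are not modelled here.

```python
def kish_design_effect(weights, mean_cluster_size, rho):
    """Approximate design effect from unequal weights and cluster sampling (Kish)."""
    n = len(weights)
    # Factor due to unequal weighting: 1 + (coefficient of variation of weights)^2.
    deff_weighting = n * sum(w * w for w in weights) / sum(weights) ** 2
    # Factor due to clustering, with rho the intra-cluster correlation coefficient.
    deff_clustering = 1 + (mean_cluster_size - 1) * rho
    return deff_weighting * deff_clustering

# Hypothetical example: equal weights, 10 units per cluster, rho = 0.1.
deff = kish_design_effect([1.0] * 20, mean_cluster_size=10, rho=0.1)
print(round(deff, 3))
```

A design effect above 1 means the complex design needs a larger sample than simple random sampling to reach the same precision.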

Two-stage Sampling for Estimation of Prevalence of Bovine Tuberculosis

  • 박선일
    • 한국임상수의학회지 / Vol. 28, No. 4 / pp. 422-426 / 2011
  • For a national survey that targets a wide geographic region or an entire country, a multi-stage sampling approach is widely used to overcome the problems of simple random sampling, to consider both herd- and animal-level factors associated with disease occurrence, and to adjust for the clustering effect of disease in the population when calculating sample size. The aim of this study was to establish the sample size for estimating the prevalence of bovine tuberculosis (TB) in Korea using a stratified two-stage sampling design. The sample size was determined by taking into account the possible clustering of TB-infected animals in individual herds, to increase the reliability of survey results. In this study, the country was stratified into nine provinces (administrative units), and the herd, the primary sampling unit, was treated as a cluster. For all analyses, a design effect of 2, a between-cluster prevalence of 50% (to yield the maximum sample size), and a mean herd size of 65 were assumed because of the lack of available information. Using a two-stage sampling scheme, the number of cattle sampled per herd was 65, regardless of the confidence level, prevalence, and mean herd size examined. The number of clusters to be sampled at a 95% level of confidence was estimated to be 296, 74, 33, 19, 12, and 9 for desired precision of 0.01, 0.02, 0.03, 0.04, 0.05, and 0.06, respectively. The total sample size at a 95% confidence level was therefore 172,872, 43,218, 19,224, 10,818, 6,930, and 4,806 for desired precision ranging from 0.01 to 0.06. The sample size increased with the desired precision and the design effect. When the number of cattle sampled per herd was fixed at values from 5 to 40 in 5-head steps, the total sample size at a 95% confidence level was estimated to be 6,480, 10,080, 13,770, 17,280, 20,925, 24,570, 28,350, and 31,680, respectively. The percent increase in total sample size resulting from the use of an intra-cluster correlation coefficient of 0.3 was 22.2, 32.1, 36.3, 39.6, 41.9, 42.9, 42.2, and 44.3%, respectively, in comparison with a coefficient of 0.2.
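
The abstract does not spell out its formula. The sketch below is one reading of the arithmetic, offered as an assumption rather than the paper's stated method: a simple-random-sample size ceil(z^2 * p(1-p) / d^2) with z = 1.96 and p = 0.5, multiplied by the design effect of 2 within each of the nine provinces, with the number of herds obtained by dividing by the mean herd size of 65. Under this reading the reported cluster counts and totals for precision 0.01 to 0.06 are reproduced. The function name is hypothetical, and exact fractions are used to avoid floating-point rounding in the ceiling.

```python
from fractions import Fraction
from math import ceil

def tb_sample_sizes(precision, z=Fraction(196, 100), prevalence=Fraction(1, 2),
                    deff=2, herd_size=65, n_strata=9):
    """Return (herds per stratum, total animals) under the assumed reading of the abstract."""
    n_srs = ceil(z**2 * prevalence * (1 - prevalence) / precision**2)
    n_stratum = deff * n_srs                      # inflate by the design effect
    herds = ceil(Fraction(n_stratum, herd_size))  # clusters of 65 animals each
    return herds, n_strata * n_stratum

results = [tb_sample_sizes(Fraction(k, 100)) for k in range(1, 7)]
for k, (herds, total) in zip(range(1, 7), results):
    print(f"precision 0.0{k}: {herds} herds, total sample size {total:,}")
```

The fixed-cattle-per-herd scenario and the intra-cluster correlation comparison at the end of the abstract rest on inputs it does not report, so they are not modelled here.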

A Sample Design for National Nutrition Survey

  • 전태윤;정기혜
    • Journal of Nutrition and Health / Vol. 17, No. 3 / pp. 236-241 / 1984
  • To clarify the relationship between sample design and sample surveys in the community, research was conducted on the sample design for the 1983 National Nutrition Survey. The analysis was based on The Report of a Settled Population, 1981, conducted by the National Bureau of Statistics, Economic Planning Board. The sample was drawn by stratified two-stage sampling with systematic sampling of Ban or Li as the administrative unit. The population represents the whole nation excluding Jeju-do, which was left out for budgetary reasons. The selection of sampling units and the sampling procedure were as follows. 1) Stratify the nationwide area into 20 sections according to administrative districts. 2) Determine the sample size in each section according to an equal proportional rate (1 / 8040), for about 1,000 households in the sample. 3) Select the 25 sampling units by section in proportion to the number of households. 4) Select 10 households at random, with equal probability, from each Ban or Li as the final sampling units. Using this procedure, 1,000 households were sampled for the 1983 National Nutrition Survey.
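
The two-stage idea in the procedure above can be sketched as follows: areas (standing in for Ban or Li) are chosen by systematic sampling with a random start, then households are drawn with equal probability within each chosen area. This is a simplified illustration under stated assumptions. The frame, names, and counts are hypothetical, and the selection in proportion to household counts mentioned in step 3 is not implemented.

```python
import random

def systematic_sample(units, n, rng):
    """Select n units at a fixed interval from an ordered list, starting at a random point."""
    interval = len(units) / n
    start = rng.random() * interval
    picks = [min(int(start + i * interval), len(units) - 1) for i in range(n)]
    return [units[j] for j in picks]

def two_stage_sample(households_by_area, n_areas, n_households, rng):
    """Stage 1: systematic sample of areas. Stage 2: equal-probability sample of households."""
    areas = systematic_sample(sorted(households_by_area), n_areas, rng)
    return {a: rng.sample(households_by_area[a], n_households) for a in areas}

rng = random.Random(1983)
# Hypothetical frame for one stratum: 50 areas, each listing 40 household IDs.
frame = {f"area{a:02d}": [f"area{a:02d}-hh{h:02d}" for h in range(40)] for a in range(50)}
sample = two_stage_sample(frame, n_areas=5, n_households=10, rng=rng)
print({area: len(households) for area, households in sample.items()})
```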

Development of a Sampling Strategy and Sample Size Calculation to Estimate the Distribution of Mammographic Breast Density in Korean Women

  • Jun, Jae Kwan;Kim, Mi Jin;Choi, Kui Son;Suh, Mina;Jung, Kyu-Won
    • Asian Pacific Journal of Cancer Prevention / Vol. 13, No. 9 / pp. 4661-4664 / 2012
  • Mammographic breast density is a known risk factor for breast cancer. To plan a survey estimating the distribution of mammographic breast density in Korean women, sampling strategies for a representative and efficient sampling design were evaluated through simulation. Using the target population from the National Cancer Screening Programme (NCSP) for breast cancer in 2009, we verified the distribution estimate by repeating the simulation 1,000 times using stratified random sampling to investigate the distribution of breast density of 1,340,362 women. According to the simulation results, a sampling design that stratifies the nation into three groups (metropolitan, urban, and rural), with a total sample size of 4,000, estimated the distribution of breast density in Korean women to within a tolerance of 0.01%. Based on these results, a nationwide survey to estimate the distribution of mammographic breast density among Korean women can be conducted efficiently.
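
A repeated-sampling check of this kind can be sketched as below. The population here is synthetic, not the NCSP data. The stratum sizes and rates are hypothetical, breast density is reduced to a single yes/no category for simplicity, and proportional allocation is assumed because the abstract does not state the allocation used.

```python
import random

def stratified_estimate(population, sample_size, rng):
    """Estimate a population proportion from a proportionally allocated stratified sample."""
    N = sum(len(units) for units in population.values())
    estimate = 0.0
    for units in population.values():
        n_h = max(1, round(sample_size * len(units) / N))
        drawn = rng.sample(units, n_h)  # simple random sample without replacement
        estimate += (len(units) / N) * (sum(drawn) / n_h)
    return estimate

rng = random.Random(2009)
# Synthetic population: 1 marks membership in a hypothetical density category.
rates = {"metropolitan": 0.55, "urban": 0.50, "rural": 0.40}
sizes = {"metropolitan": 20000, "urban": 15000, "rural": 5000}
population = {s: [1 if rng.random() < rates[s] else 0 for _ in range(sizes[s])] for s in rates}
truth = sum(sum(u) for u in population.values()) / sum(sizes.values())

# Repeat the stratified draw 1,000 times and compare the estimates with the true value.
estimates = [stratified_estimate(population, 4000, rng) for _ in range(1000)]
mean_estimate = sum(estimates) / len(estimates)
print(f"true proportion {truth:.4f}, mean of 1,000 estimates {mean_estimate:.4f}")
```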