Choosing clusters for two-stage household surveys

Cluster selection in two-stage sample designs for household surveys

  • Park, Inho (Department of Statistics, Pukyong National University)
  • Received : 2016.02.17
  • Accepted : 2016.03.24
  • Published : 2016.03.31

Abstract

Two-stage sample designs are commonly used for household surveys in Korea, with enumeration districts (EDs) serving as clusters. Since clustering decomposes the population variation into within- and between-cluster components, the sample sizes allocated to each stage affect the overall precision. Alternative clusters are often considered for various reasons, such as the limited size of the EDs, their being out of date, and the inaccessibility of their household lists. In addition, Statistics Korea is currently redeveloping the EDs as part of its transition from the traditional census practice to a register-based census starting in 2015. We present an approach for evaluating the difference in precision of mean estimators between sets of cluster units that are related in a hierarchical or nested form, where the design effect is used to reflect the effect of clustering and sample allocation. We also demonstrate our approach using the U.S. Census counts from the year 2000 for Anne Arundel County in Maryland. Our research shows that the within-cluster variance can differ substantially across survey variables, and thus the choice of cluster units and the associated sample allocation scheme should reflect the corresponding variance decomposition due to clustering.

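Household surveys in Korea are commonly conducted as self-weighting two-stage sample designs that use the enumeration districts (EDs) of Statistics Korea as clusters. Because the cluster structure decomposes the variability among population units into between-cluster and within-cluster variances, the choice of the number of sample clusters and the within-cluster sample size affects the sample estimates. However, alternative clusters such as output areas (집계구) are sometimes considered for various reasons, including the size of the EDs, their being out of date, and the inaccessibility of household lists. In addition, because the Population and Housing Census changes from 2015 onward from traditional household-visit enumeration to a register census based on administrative records, the existing EDs are reportedly being rebuilt with a different form and size. In this paper, we use a design effect formula that reflects cluster sampling to derive the difference in variance that results from choosing among clusters with a hierarchical or nested structure, and we examine the sample allocations under each cluster structure that yield the same variance for a given sample size. We include a case study using data from Anne Arundel County, Maryland, U.S., which is somewhat analogous to Korea's EDs and output areas. The change in the coefficient of homogeneity caused by merging clusters is not the same across survey variables, and accordingly the sample allocation under each cluster structure should be considered jointly with the number of sample clusters.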
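
The allocation trade-off described in the abstract can be illustrated with a small sketch. It assumes the standard Kish approximation for the design effect of a mean, deff = 1 + (m - 1) * rho, where m is the number of sample households per cluster and rho is the coefficient of homogeneity. This is not necessarily the design effect formula derived in the paper, and the function names and all numeric values below are hypothetical.

```python
import math


def kish_deff(m, rho):
    """Approximate design effect of a mean under two-stage sampling.

    Assumption: Kish's approximation deff = 1 + (m - 1) * rho,
    with m sample units per cluster and homogeneity coefficient rho.
    """
    return 1.0 + (m - 1.0) * rho


def equal_precision_take(m_a, rho_a, rho_b):
    """Per-cluster sample size under cluster type B that gives the same
    design effect as taking m_a units per cluster under cluster type A.

    Solves 1 + (m_b - 1) * rho_b = 1 + (m_a - 1) * rho_a for m_b.
    """
    if rho_b <= 0:
        raise ValueError("rho_b must be positive to solve for m_b")
    return 1.0 + (m_a - 1.0) * rho_a / rho_b


# Hypothetical inputs: a fixed total sample of households, small clusters
# with stronger homogeneity, and merged (larger) clusters with weaker
# homogeneity.
n_total = 1000
m_small, rho_small = 10, 0.05
rho_large = 0.02

m_large = equal_precision_take(m_small, rho_small, rho_large)
deff_small = kish_deff(m_small, rho_small)
deff_large = kish_deff(m_large, rho_large)

# With the total sample size fixed, equal design effects imply equal
# variance, but the number of sample clusters differs by cluster type.
clusters_small = n_total / m_small
clusters_large = n_total / m_large

print(f"deff (small clusters): {deff_small:.3f}")
print(f"deff (large clusters): {deff_large:.3f}")
print(f"households per large cluster: {m_large:.1f}")
print(f"sample clusters needed: {clusters_small:.1f} vs {clusters_large:.1f}")
```

Under these assumed values, the merged clusters tolerate a larger within-cluster sample for the same precision, so fewer sample clusters are needed. Because rho generally differs by survey variable, the equal-precision allocation would differ by variable as well.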
