A study on time series linkage in the Household Income and Expenditure Survey

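A study on time series linkage for the expenditure section of the Household Income and Expenditure Survey (translated from the Korean title)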

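  • 김시현 (Department of Statistics, Chung-Ang University) ;
  • 성병찬 (Department of Statistics, Chung-Ang University) ;
  • 최영근 (Department of Statistics, Sookmyung Women's University) ;
  • 여인권 (Department of Statistics, Sookmyung Women's University)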
  • Received : 2022.05.11
  • Accepted : 2022.06.02
  • Published : 2022.08.31

Abstract

The Household Income and Expenditure Survey is a flagship survey of Statistics Korea that aims to measure and analyze national income and consumption levels, and their changes, by capturing the current state of household balances. Recently, breaks in its time series caused by the large-scale reorganizations of the survey method in 2017 and 2019 have become an issue. In this study, we model the characteristics of the survey's time series up to 2016 and use the fitted models to compute forecasts that link the expenditure series for 2017 and 2018. To reflect the characteristics of all expenditure item series evenly and to reduce the influence of any single forecast model, we combine a total of 8 models, including regression models, time series models, and machine learning techniques. A noteworthy aspect of this study is that it improves the forecasts by using the optimal combination technique, which, unlike the top-down or bottom-up methods, reflects the hierarchical structure of the survey without loss of information. Applying the proposed method to the expenditure series from 2017 to 2019 shows that it helps restore time series continuity and improves the forecasts. In addition, we confirm that the hierarchical forecasts produced by the optimal combination method bring the linked series closer to the actual survey series.

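The Household Income and Expenditure Survey is a representative survey of Statistics Korea whose purpose is to measure and analyze the level of national income and consumption, and their changes, by identifying the state of household balances. Recently, several institutions have recognized the time series break that occurred in the expenditure section of the survey in 2017 and 2018 and are conducting research on linking the series over this period. In this study, we identify the time series characteristics of the survey up to 2016 and, reflecting them, derive forecasts that link the expenditure series for 2017 and 2018. To reflect the time series characteristics of each expenditure item evenly while reducing the influence of any particular forecast model, we combined a total of 8 regression models, time series models, and machine learning techniques. A notable feature of this study is that it improved forecasting performance by using the optimal combination technique, which can reflect the hierarchical structure of the survey without loss of information, rather than a top-down or bottom-up approach. The linkage analysis of the expenditure section for 2017 to 2019 confirmed that the proposed linkage method contributes to restoring time series continuity and improving forecasting performance, and that the forecasts after hierarchical adjustment by the optimal combination technique are closer to the survey data.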
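
The two ideas in the abstract, combining several forecast models and then reconciling the combined forecasts across a hierarchy, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the two-item hierarchy (total = A + B), the three hypothetical models, the invented forecast values, the equal-weight averaging, and the ordinary least squares form of optimal combination. The study itself uses 8 models and the survey's full expenditure hierarchy, and its exact weighting scheme is not stated in the abstract.

```python
import numpy as np

# Hypothetical hierarchy: total expenditure = item A + item B.
# Rows of S are [total, A, B]; columns are the bottom-level series [A, B].
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Invented base forecasts for [total, A, B] from three hypothetical models.
model_forecasts = np.array([[100.0, 58.0, 38.0],
                            [104.0, 60.0, 41.0],
                            [102.0, 62.0, 41.0]])

# Step 1: combine the models. Equal weights are an assumption here.
base = model_forecasts.mean(axis=0)   # not coherent: A + B != total

# Step 2: OLS optimal combination, reconciled = S (S'S)^{-1} S' base.
# Solving the normal equations avoids forming the inverse explicitly.
bottom = np.linalg.solve(S.T @ S, S.T @ base)
reconciled = S @ bottom

print("base:      ", base)
print("reconciled:", reconciled)
```

In this sketch the base forecasts miss the adding-up constraint by 2 units, and the reconciliation spreads that gap over all three series rather than forcing it onto the total alone (bottom-up) or the items alone (top-down).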

Acknowledgement

This research was supported by the Statistics Korea general research grant for social statistics production (3000-3032-304-260-01).
