Proposal for an Alternative Data Pipeline to Secure the Timeliness of Official Statistical Indicators

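  • 조용복 (Department of Management Information Systems, Dong-A University);
  • 김도완 (Division of Social Sciences, Hansung University)
  • Received: 2023.08.28
  • Reviewed: 2023.10.01
  • Published: 2023.10.30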

Abstract

This study reviews existing nowcasting methodologies with the aim of securing the timeliness of officially published statistical indicators, and examines the alternative data and collection systems needed to operate a real-time nowcasting model capable of analyzing current economic conditions. We explore high-frequency real-time data from both the public and private sectors that can be used to predict economic indicators, and we propose a cloud-based pipeline for data collection, processing, and modeling. In addition, we identify the factors that must be considered when estimating a real-time nowcasting model and managing its data, and on that basis present a forecasting process for official statistical indicators that is both timely and stable.

Funding

This research was supported by a National Research Foundation of Korea grant funded by the Korean government (Ministry of Science and ICT) (No. RS-2023-00214172).
