Forecasting of the COVID-19 pandemic situation of Korea

  • Goo, Taewan (Interdisciplinary Program in Bioinformatics, Seoul National University) ;
  • Apio, Catherine (Interdisciplinary Program in Bioinformatics, Seoul National University) ;
  • Heo, Gyujin (Interdisciplinary Program in Bioinformatics, Seoul National University) ;
  • Lee, Doeun (Interdisciplinary Program in Bioinformatics, Seoul National University) ;
  • Lee, Jong Hyeok (Department of Statistics, Seoul National University) ;
  • Lim, Jisun (The Research Institute of Basic Sciences, Seoul National University) ;
  • Han, Kyulhee (Interdisciplinary Program in Bioinformatics, Seoul National University) ;
  • Park, Taesung (Interdisciplinary Program in Bioinformatics, Seoul National University)
  • Received : 2021.03.15
  • Accepted : 2021.03.24
  • Published : 2021.03.31

Abstract

Predictive modeling of the novel coronavirus disease 2019 (COVID-19) in the literature broadly uses susceptible-exposed-infected-recovered (SEIR)/SIR, agent-based, and curve-fitting models. Governments and legislative bodies rely on insights from prediction models to suggest new policies and to assess the effectiveness of enforced policies. Access to accurate outbreak prediction models is therefore essential for understanding the likely spread and consequences of infectious diseases. The objective of this study is to predict the future COVID-19 situation of Korea. We employed six models for this analysis: SEIR, local linear regression (LLR), negative binomial (NB) regression, segmented Poisson, a deep-learning-based long short-term memory (LSTM) model, and a tree-based gradient boosting machine (GBM). After prediction, model performance was compared using relative mean squared errors (RMSE) on two pairs of training and testing data: training on January 20, 2020 to December 31, 2020 with testing on January 1, 2021 to February 28, 2021, and training on January 20, 2020 to January 31, 2021 with testing on February 1, 2021 to February 28, 2021. All models except the segmented Poisson model predicted a decline in daily confirmed cases in the country in the near future. Comparison of RMSE values showed that LLR, GBM, SEIR, NB, and LSTM, in that order, performed well in forecasting the pandemic situation of the country. A good understanding of epidemic dynamics would greatly enhance the control and prevention of COVID-19 and other infectious diseases. With daily confirmed cases increasing since the beginning of this year, these results could therefore help the pandemic response by informing decisions about planning, resource allocation, and social distancing policies.
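The evaluation design described in the abstract (two date-based train/test windows scored by an error metric) can be sketched as follows. This is only an illustration under stated assumptions: the abstract gives no formula for RMSE, so the sketch assumes the common root mean squared error; `naive_forecast` is a hypothetical placeholder and not one of the study's models; and the daily counts are synthetic, not real case data.

```python
from datetime import date, timedelta
import math

# The two train/test windows stated in the abstract.
SPLITS = [
    {"train": (date(2020, 1, 20), date(2020, 12, 31)),
     "test": (date(2021, 1, 1), date(2021, 2, 28))},
    {"train": (date(2020, 1, 20), date(2021, 1, 31)),
     "test": (date(2021, 2, 1), date(2021, 2, 28))},
]


def window(series, start, end):
    """Return values of a {date: count} series within [start, end], in date order."""
    return [series[d] for d in sorted(series) if start <= d <= end]


def rmse(observed, predicted):
    """Root mean squared error (ASSUMED definition; the abstract gives no formula)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def naive_forecast(train_values, horizon):
    """HYPOTHETICAL placeholder model: repeat the mean of the last 7 training days."""
    last_week = train_values[-7:]
    return [sum(last_week) / len(last_week)] * horizon


# SYNTHETIC daily counts for illustration only (not real COVID-19 data).
first_day = date(2020, 1, 20)
n_days = (date(2021, 2, 28) - first_day).days + 1
series = {first_day + timedelta(days=i): 100 + (i % 7) * 10 for i in range(n_days)}

for split in SPLITS:
    train = window(series, *split["train"])
    test = window(series, *split["test"])
    predicted = naive_forecast(train, len(test))
    print(split["test"][0], "to", split["test"][1], "RMSE:", round(rmse(test, predicted), 2))
```

Any of the models named in the abstract could take the place of `naive_forecast`, provided it returns one prediction per test day; scoring every model on the same windows is what makes the error values comparable.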
