Prediction of the number of confirmed COVID-19 cases using the SARIMA model

  • Kim, Jae-Ho (Department of Computer Science, The University of Suwon)
  • Kim, Jang-Young (Department of Computer Science, The University of Suwon)
  • Received : 2021.11.04
  • Accepted : 2021.12.20
  • Published : 2022.01.31

Abstract

The daily number of confirmed coronavirus disease 2019 (COVID-19) cases in Korea remains between the high 1,000s and the 2,000s, and it is not falling readily even though vaccination rates are rising. New variants continue to emerge, and the Mu variant, which WHO has reported in several countries, has now been identified in Korea. In this study, we predict the number of confirmed COVID-19 cases in Korea with a SARIMA model to support COVID-19 prevention strategy. Trend and seasonality were observed in the data, and the ADF and KPSS tests were applied accordingly to check stationarity. The data were transformed into a stationary form through differencing, log transformation, and seasonality removal, and then plotted. The orders of the SARIMA(p,d,q)(P,D,Q,S) model were determined in sequence: if seasonality is present, the seasonal period S is fixed first, then the seasonal orders P, D, and Q, and finally the non-seasonal ARIMA orders p, d, and q are read from the ACF and PACF of the series with seasonality removed.
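The transformation and order-identification steps described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the series is synthetic, the seasonal period S = 7 (a weekly reporting cycle) is assumed because the abstract does not give it, and the helper names (`log_transform`, `difference`, `acf`, `pacf`) are hypothetical.

```python
import math
import random

S = 7  # ASSUMPTION: weekly seasonality; the abstract does not state S


def log_transform(series):
    """log(1 + x) stabilises the variance and tolerates zero counts."""
    return [math.log1p(x) for x in series]


def difference(series, lag=1):
    """Lag-`lag` difference y[t] - y[t - lag]; lag=S removes seasonality."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / var
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Sample partial autocorrelation via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    out = [1.0]
    prev = []  # AR coefficients of the order-(k-1) fit
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        prev.append(phi_kk)
        out.append(phi_kk)
    return out


# Synthetic daily counts (NOT real data): growth trend x weekly pattern x noise.
rng = random.Random(42)
weekday_factor = [1.0, 1.1, 1.15, 1.1, 1.05, 0.8, 0.7]
cases = [
    1000 * math.exp(0.01 * t) * weekday_factor[t % S] * math.exp(rng.gauss(0, 0.02))
    for t in range(140)
]

logged = log_transform(cases)                # log transformation
deseasonalized = difference(logged, lag=S)   # seasonal difference (D = 1)
stationary = difference(deseasonalized)      # first difference (d = 1)

r = acf(stationary, max_lag=14)
p = pacf(stationary, max_lag=14)
for lag in (1, S, 2 * S):
    print(f"lag {lag:2d}: ACF={r[lag]: .3f}  PACF={p[lag]: .3f}")
```

In this reading, spikes at lags S and 2S suggest the seasonal orders P and Q, while the early lags suggest the non-seasonal p and q. In practice a library such as statsmodels offers `adfuller` and `kpss` for the stationarity tests and `SARIMAX` for fitting; the abstract does not say which tooling the study used.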
