http://dx.doi.org/10.29220/CSAM.2022.29.1.103

Exploring COVID-19 in mainland China during the lockdown of Wuhan via functional data analysis  

Li, Xing (Department of Statistics and Finance, University of Science and Technology of China)
Zhang, Panpan (Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania)
Feng, Qunqiang (Department of Statistics and Finance, University of Science and Technology of China)
Publication Information
Communications for Statistical Applications and Methods / v.29, no.1, 2022, pp. 103-125
Abstract
In this paper, we analyze time series of COVID-19 case and death counts in mainland China, where the outbreak began in December 2019. The study period covers the lockdown of Wuhan. We apply functional data analysis methods to these series in three parts. First, we conduct functional principal component analysis to investigate the modes of variation. Second, we carry out functional canonical correlation analysis to explore the relationship between confirmed cases and deaths. Finally, we cluster the counts of confirmed cases using a method based on the Expectation-Maximization (EM) algorithm, with the number of clusters determined by cross-validation. We also compare the clustering results with publicly available migration data.
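As a rough illustration of the first step named in the abstract, functional principal component analysis can be approximated on densely and regularly observed curves by a singular value decomposition of the centered data matrix. The sketch below is not the authors' implementation; the function name `fpca`, the synthetic curves, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np


def fpca(curves, n_components=2):
    """Discretized FPCA for curves observed on a common, regular grid.

    curves: array of shape (n_curves, n_grid_points), one row per curve.
    Returns the mean curve, the leading components (discretized
    eigenfunctions, up to grid scaling), the scores of each curve on
    those components, and the fraction of variance each one explains.
    """
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # Rows of vt are orthonormal directions of variation across the grid.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = singular_values**2 / (curves.shape[0] - 1)
    explained = eigenvalues / eigenvalues.sum()
    components = vt[:n_components]
    scores = centered @ components.T
    return mean_curve, components, scores, explained[:n_components]


# Synthetic stand-in for regional curves (hypothetical data, not COVID-19 counts):
# a shared trend plus two random modes of variation and small noise.
rng = np.random.default_rng(0)
days = np.linspace(0.0, 1.0, 50)
n_regions = 30
amplitude = rng.normal(0.0, 2.0, size=n_regions)
timing = rng.normal(0.0, 0.5, size=n_regions)
curves = (
    np.log1p(100 * days)
    + amplitude[:, None] * np.sin(np.pi * days)
    + timing[:, None] * np.sin(2 * np.pi * days)
    + rng.normal(0.0, 0.05, size=(n_regions, days.size))
)

mean_curve, components, scores, explained = fpca(curves, n_components=2)
print("Fraction of variance explained:", np.round(explained, 3))
```

The resulting scores give a low-dimensional summary of each curve; a model-based clustering step such as the EM-based method mentioned in the abstract would typically operate on a summary of this kind, although the exact inputs used in the paper are not stated here.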
Keywords
COVID-19; functional canonical correlation; functional cluster analysis; functional principal component analysis; migration