http://dx.doi.org/10.5808/gi.22052

The effect of missing levels of nesting in multilevel analysis  

Park, Seho (Department of Biostatistics and Health Data Science, Indiana University School of Medicine)
Chung, Yujin (Department of Applied Statistics, Kyonggi University)
Abstract
Multilevel analysis is an appropriate and powerful tool for analyzing hierarchically structured data and is widely applied in fields ranging from public health to genomics. In practice, however, information on some nesting levels may be lost, either because the data fail to capture every level of the hierarchy or because the top or intermediate levels are ignored in the analysis. In this study, we consider a multilevel linear mixed-effects model (LMM) with single imputation that can incorporate all levels of the data hierarchy when top- or intermediate-level clusters are missing. We evaluate its performance against models that ignore the data hierarchy or the missing intermediate-level clusters. To this end, we applied the multilevel LMM with single imputation and the competing models to hierarchically structured cohort data with some intermediate levels missing and to simulated data with various cluster sizes and missing rates of intermediate-level clusters. The simulation study showed that, in terms of mean squared error and coverage probability, the LMM with single imputation estimates the fixed-effect coefficients and variance components of a multilevel model more accurately than models that ignore the data hierarchy or the missing clusters. In particular, models that ignored the data hierarchy or the missing clusters overestimated the variance components of the random effects. The analysis of the hierarchically structured cohort data gave similar results.
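For orientation, a generic three-level random-intercept LMM can be written as below. This is a minimal sketch in assumed notation, not the specification used in the article: the indices i (individual), j (intermediate-level cluster), and k (top-level cluster), and the symbols for the random effects and their variances, are introduced here for illustration only.

```latex
% Assumed notation: individual i in intermediate cluster j in top-level cluster k
\begin{align}
  y_{ijk} &= \mathbf{x}_{ijk}^{\top}\boldsymbol{\beta} + v_{k} + u_{jk} + e_{ijk}, \\
  v_{k} &\sim N(0, \sigma_{v}^{2}), \quad
  u_{jk} \sim N(0, \sigma_{u}^{2}), \quad
  e_{ijk} \sim N(0, \sigma_{e}^{2}), \\
  \operatorname{Var}(y_{ijk} \mid \mathbf{x}_{ijk}) &= \sigma_{v}^{2} + \sigma_{u}^{2} + \sigma_{e}^{2},
\end{align}
% with v_k, u_{jk}, and e_{ijk} assumed mutually independent.
```

Under these assumptions, a fitted model that omits the intermediate-level effect has no separate term for its variance, so that variance has to be absorbed by the components that remain. This is one way to read the overestimation of variance components reported above.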
Keywords
hierarchical structure data; missing levels of nesting; multilevel model