The effect of missing levels of nesting in multilevel analysis

  • Park, Seho (Department of Biostatistics and Health Data Science, Indiana University School of Medicine)
  • Chung, Yujin (Department of Applied Statistics, Kyonggi University)
  • Received : 2022.08.22
  • Accepted : 2022.09.22
  • Published : 2022.09.30

Abstract

Multilevel analysis is an appropriate and powerful tool for analyzing hierarchically structured data and is widely applied in fields ranging from public health to genomics. In practice, however, information on some nesting levels may be lost, either because the data fail to capture every level of the hierarchy or because the top or intermediate levels are ignored in the analysis. In this study, we consider a multilevel linear mixed-effects model (LMM) with single imputation that can incorporate all levels of the data hierarchy even when top- or intermediate-level clusters are missing. We evaluate the performance of this multilevel LMM with single imputation and compare it with that of models that ignore the data hierarchy or the missing intermediate-level clusters. To this end, we applied the multilevel LMM with single imputation and the other models to hierarchically structured cohort data in which some intermediate levels are missing, and to simulated data with various cluster sizes and missing rates of intermediate-level clusters. A thorough simulation study demonstrated that, in terms of mean squared error and coverage probability, the LMM with single imputation estimates the fixed coefficients and variance components of a multilevel model more accurately than models that ignore the data hierarchy or the missing clusters. In particular, when models ignoring the data hierarchy or the missing clusters were applied, the variance components of the random effects were overestimated. We observed similar results in the analysis of the hierarchically structured cohort data.
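The overestimation of variance components when an intermediate level is ignored can be illustrated with a small simulation. The sketch below is not the authors' method: it uses no imputation and no likelihood-based LMM fit. Instead it swaps in classical ANOVA (method-of-moments) estimators for a balanced three-level random-intercept design. The function names `simulate` and `variance_components`, the cluster sizes, and the variance values are all assumptions chosen for illustration.

```python
import random
from statistics import fmean


def simulate(n_top, n_mid, n_obs, sd_top, sd_mid, sd_res, rng):
    """Balanced three-level data: data[i][j] holds the outcomes of
    intermediate cluster j nested in top-level cluster i."""
    data = []
    for _ in range(n_top):
        a = rng.gauss(0.0, sd_top)          # top-level random intercept
        top = []
        for _ in range(n_mid):
            b = rng.gauss(0.0, sd_mid)      # intermediate-level random intercept
            top.append([a + b + rng.gauss(0.0, sd_res) for _ in range(n_obs)])
        data.append(top)
    return data


def variance_components(data, ignore_mid=False):
    """ANOVA (method-of-moments) variance estimates for balanced nested data.
    With ignore_mid=True the intermediate level is dropped and the data are
    treated as observations nested directly in top-level clusters."""
    I, J, K = len(data), len(data[0]), len(data[0][0])
    grand = fmean(y for top in data for mid in top for y in mid)
    top_means = [fmean(y for mid in top for y in mid) for top in data]
    mid_means = [[fmean(mid) for mid in top] for top in data]

    ss_top = J * K * sum((m - grand) ** 2 for m in top_means)
    ss_mid = K * sum((mid_means[i][j] - top_means[i]) ** 2
                     for i in range(I) for j in range(J))
    ss_res = sum((y - mid_means[i][j]) ** 2
                 for i in range(I) for j in range(J) for y in data[i][j])
    ms_top = ss_top / (I - 1)

    if ignore_mid:
        # Two-level analysis: intermediate variation is pooled into "within top".
        ms_within = (ss_mid + ss_res) / (I * (J * K - 1))
        return {"top": (ms_top - ms_within) / (J * K),
                "mid": None,
                "resid": ms_within}

    ms_mid = ss_mid / (I * (J - 1))
    ms_res = ss_res / (I * J * (K - 1))
    return {"top": (ms_top - ms_mid) / (J * K),
            "mid": (ms_mid - ms_res) / K,
            "resid": ms_res}


# Assumed true variances: top = 1, intermediate = 4, residual = 1.
rng = random.Random(2022)
data = simulate(n_top=500, n_mid=5, n_obs=4,
                sd_top=1.0, sd_mid=2.0, sd_res=1.0, rng=rng)
full = variance_components(data)
naive = variance_components(data, ignore_mid=True)
print("three-level estimates:        ", full)
print("intermediate level ignored:   ", naive)
```

In this balanced setting, the variance of the ignored intermediate level does not disappear. It is split between the top-level and residual components, so both come out larger than in the full three-level analysis. Real data with unbalanced or partially missing clusters, as studied in the paper, would need a fitted mixed model rather than these closed-form estimators.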

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2021R1C1C1011250).
