http://dx.doi.org/10.7465/jkdi.2014.25.5.1025

Efficient strategy for the genetic analysis of related samples with a linear mixed model  

Lim, Jeongmin (Chunlab, Inc.)
Sung, Joohon (Department of Public Health Science, Seoul National University)
Won, Sungho (Department of Public Health Science, Seoul National University)
Publication Information
Journal of the Korean Data and Information Science Society, v.25, no.5, 2014, pp. 1025-1038
Abstract
Linear mixed models have often been used for genetic association analysis with family-based samples. The correlation matrix for family-based samples is constructed from kinship coefficients, and it assumes that parental phenotypes are independent and that the correlation between parent and offspring is the same as the correlation between siblings. However, parental heights, for instance, are positively correlated, which indicates that the assumed correlation matrix is often violated. Statistical validity and power are affected by the appropriateness of the assumed variance-covariance matrix, and in this study we propose a linear mixed model with a flexible variance-covariance matrix. Our results show that the proposed method is usually more efficient than existing approaches, and its application to a genome-wide association study of body mass index illustrates its practical value in real data analysis.
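The kinship-based assumption described above can be made concrete with a small sketch. It builds the phenotypic covariance for one nuclear family (father, mother, two children) as sigma_g2 * 2 * Phi + sigma_e2 * I, where Phi holds the standard kinship coefficients, and shows that the implied spouse correlation is zero while the parent-offspring and sibling correlations coincide. The function names, the variance values, and the three-parameter `flexible_correlation` are illustrative assumptions only; the abstract does not specify how the authors parameterize their flexible matrix.

```python
import numpy as np

def kinship_covariance(sigma_g2, sigma_e2):
    """Covariance implied by kinship coefficients for one nuclear family.

    Member order: father, mother, child 1, child 2.
    sigma_g2 and sigma_e2 are hypothetical genetic and residual variances.
    """
    # Kinship coefficients: 0.5 with oneself, 0.25 for parent-offspring
    # and full siblings, 0 for unrelated spouses.
    phi = np.array([
        [0.50, 0.00, 0.25, 0.25],
        [0.00, 0.50, 0.25, 0.25],
        [0.25, 0.25, 0.50, 0.25],
        [0.25, 0.25, 0.25, 0.50],
    ])
    return sigma_g2 * 2.0 * phi + sigma_e2 * np.eye(4)

def flexible_correlation(rho_spouse, rho_po, rho_sib):
    """Illustrative alternative (an assumption, not the paper's model):
    separate correlations for spouses, parent-offspring and sibling pairs."""
    return np.array([
        [1.0,        rho_spouse, rho_po,  rho_po],
        [rho_spouse, 1.0,        rho_po,  rho_po],
        [rho_po,     rho_po,     1.0,     rho_sib],
        [rho_po,     rho_po,     rho_sib, 1.0],
    ])

# Convert the kinship-based covariance to a correlation matrix.
V = kinship_covariance(sigma_g2=0.6, sigma_e2=0.4)
sd = np.sqrt(np.diag(V))
corr = V / np.outer(sd, sd)

spouse_corr = corr[0, 1]         # fixed at zero by the kinship structure
parent_child_corr = corr[0, 2]   # forced to equal the sibling correlation
sibling_corr = corr[2, 3]

# A flexible structure can let the three pair types differ.
R_flex = flexible_correlation(rho_spouse=0.2, rho_po=0.3, rho_sib=0.4)
```

Under the kinship structure no choice of the two variances can produce a nonzero spouse correlation or unequal parent-offspring and sibling correlations, which is the restriction the abstract points to.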
Keywords
Average information method; genome-wide association study; linear mixed model; Newton-Raphson method; restricted maximum likelihood