Bayesian analysis of longitudinal data on Indonesian children using a Markov binary regression model with t-link

Bayesian inference of longitudinal Markov binary regression models with t-link function

  • Sim, Bohyun (Department of Statistics, Pusan National University) ;
  • Chung, Younshik (Department of Statistics, Pusan National University)
  • Received : 2019.11.21
  • Accepted : 2020.01.08
  • Published : 2020.02.29

Abstract

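This paper presents a longitudinal Markov binary regression model with a t-link function for cases where the lag (transition order) of the model is either known or unknown. In binary regression, logit and probit models are most commonly used. Because the t distribution approaches the normal distribution as its degrees of freedom increase, the t-link function can be used in place of the probit model for greater flexibility. Moreover, Markov regression models can be used for longitudinal data. We propose a Bayesian method for determining the order of the Markov regression model. Specifically, when the order of each model is known, models are compared using the DIC; when the order is unknown, the posterior probabilities of the candidate models are used. To handle the complicated Bayesian computation, the model is reformulated using the methods of Albert and Chib (1993), Kuo and Mallick (1998), and Erkanli et al. (2001). The proposed method is applied to simulated data and to the longitudinal data on Indonesian children studied by Sommer et al. (1984). MCMC methods are used to identify the best model under both the known-order and unknown-order assumptions. In addition, the Gelman and Rubin diagnostic is used to check the convergence of the Metropolis-Hastings algorithm.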
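As a rough illustration of the kind of model described above, the following Python sketch simulates one subject's binary series from a first-order Markov binary regression with a t-link, written in latent-variable form. It is a sketch under stated assumptions and not the authors' implementation: the function name, the single covariate, the parameter names (`beta0`, `beta1`, `gamma`, `nu`), and the initial state of 0 are all hypothetical. The t-distributed error is drawn as a scale mixture of normals, which is the representation that data-augmentation samplers in the style of Albert and Chib (1993) typically exploit.

```python
import random


def simulate_markov_t_link(x, beta0, beta1, gamma, nu, seed=0):
    """Simulate y_1..y_T for one subject (hypothetical first-order model).

    Latent form: y_t = 1 if w_t > 0, else 0, where
        w_t = beta0 + beta1 * x[t] + gamma * y_{t-1} + e_t,  e_t ~ t_nu.
    The t error is generated as a scale mixture of normals:
        lam ~ Gamma(shape=nu/2, rate=nu/2),  e_t | lam ~ N(0, 1/lam).
    """
    rng = random.Random(seed)
    y_prev = 0  # assumed initial state; not specified in the abstract
    ys = []
    for x_t in x:
        # random.gammavariate takes (shape, scale); scale = 1 / rate = 2 / nu
        lam = rng.gammavariate(nu / 2.0, 2.0 / nu)
        e_t = rng.gauss(0.0, 1.0) / lam ** 0.5
        w_t = beta0 + beta1 * x_t + gamma * y_prev + e_t
        y_prev = 1 if w_t > 0 else 0
        ys.append(y_prev)
    return ys


if __name__ == "__main__":
    covariate = [0.1, -0.3, 0.5, 1.2, -0.8, 0.0]
    print(simulate_markov_t_link(covariate, 0.2, 0.5, 1.0, nu=8.0, seed=1))
```

A larger `nu` makes the error close to standard normal, so the simulated model approaches a probit Markov regression; a higher transition order would add further lagged responses to `w_t`.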

In this paper, we present a longitudinal Markov binary regression model with a t-link function for cases where the transition order is known or unknown. In binary regression, logit or probit models are commonly used. Here, the t-link function can be used in place of the probit model for greater flexibility, since the t distribution approaches the normal distribution as the degrees of freedom go to infinity. A Markov regression model is considered because the data for each individual are longitudinal. We propose a Bayesian method to determine the transition order of the Markov regression model. In particular, when the transition order of each candidate model is known, we compare the models using the deviance information criterion (DIC) (Spiegelhalter et al., 2002); when the order is unknown, we compute and compare the posterior probabilities of the possible models. To overcome the complicated Bayesian computation, the proposed model is reformulated using the ideas of Albert and Chib (1993), Kuo and Mallick (1998), and Erkanli et al. (2001). The proposed method is applied to simulated data and to the real data examined by Sommer et al. (1984). Markov chain Monte Carlo methods are used to determine the optimal model under the assumption that the transition order of the Markov regression model is known or unknown. The method of Gelman and Rubin (1992) is also employed to check the convergence of the Metropolis-Hastings algorithm.
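The convergence check mentioned above can be sketched in a few lines. The following Python function computes the basic potential scale reduction factor of Gelman and Rubin (1992) for a single scalar parameter from several chains of equal length. It is a minimal sketch, not the code used in the paper: the function name and input layout are assumptions, and refinements such as degrees-of-freedom corrections or split chains are omitted.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Basic potential scale reduction factor (R-hat) for one parameter.

    `chains` is a list of m >= 2 lists, each holding n >= 2 draws.
    Assumes the within-chain variance is nonzero.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")

    chain_means = [mean(c) for c in chains]
    w = mean([variance(c) for c in chains])  # mean within-chain variance
    b = n * variance(chain_means)            # between-chain variance
    var_hat = (n - 1) / n * w + b / n        # pooled variance estimate
    return (var_hat / w) ** 0.5


if __name__ == "__main__":
    mixed = [[0.1, -0.2, 0.3, 0.0], [0.2, -0.1, 0.1, -0.3]]
    apart = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
    print(gelman_rubin(mixed), gelman_rubin(apart))
```

Values close to 1 suggest the chains have mixed; a common rule of thumb treats values well above 1 (for example above 1.1) as a sign that the sampler should be run longer.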

Keywords

References

  1. Albert, J. H. and Chib, S. (1993). Bayesian analysis of binary and polychotomous response data, Journal of the American Statistical Association, 88, 669-679. https://doi.org/10.1080/01621459.1993.10476321
  2. Azzalini, A. (1982). Approximate filtering of parameter driven processes, Journal of Time Series Analysis, 3, 219-223. https://doi.org/10.1111/j.1467-9892.1982.tb00344.x
  3. Bartholomew, D. J. (1983). Some recent developments in social statistics, International Statistical Review, 51, 1-9. https://doi.org/10.2307/1402728
  4. Brooks, S. P. and Gelman, A. (1998). General methods for monitoring convergence of iterative simulations, Journal of Computational and Graphical Statistics, 7, 434-455. https://doi.org/10.2307/1390675
  5. Cox, D. R. (1970). The Analysis of Binary Data, Methuen, London.
  6. Cox, D. R. (1981). Statistical analysis of time series: some recent developments, Scandinavian Journal of Statistics, 8, 93-115.
  7. Erkanli, A., Soyer, R., and Angold, A. (2001). Bayesian analyses of longitudinal binary data using Markov regression models of unknown order, Statistics in Medicine, 20, 755-770. https://doi.org/10.1002/sim.702
  8. Fisher, R. A. (1925). Applications of "Student's" distribution, Metron, 5, 90-104.
  9. Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences, Statistical Science, 7, 457-511. https://doi.org/10.1214/ss/1177011136
  10. George, E. I. and McCulloch, R. E. (1993). Variable selection via Gibbs sampling, Journal of the American Statistical Association, 88, 881-889. https://doi.org/10.1080/01621459.1993.10476353
  11. George, E. I. and McCulloch, R. E. (1997). Approaches for Bayesian variable selection, Statistica Sinica, 7, 339-373.
  12. Gosset, W. S. (1908). The probable error of a mean, Biometrika, 6, 1-25. https://doi.org/10.2307/2331554
  13. Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika, 82, 711-732. https://doi.org/10.1093/biomet/82.4.711
  14. Kalbfleisch, J. D. and Lawless, J. F. (1985). The analysis of panel data under a Markov assumption, Journal of the American Statistical Association, 80, 863-871. https://doi.org/10.1080/01621459.1985.10478195
  15. Korn, E. L. and Whittemore, A. S. (1979). Methods for analyzing panel studies of acute health effects of air pollution, Biometrics, 35, 795-802. https://doi.org/10.2307/2530111
  16. Kuo, L. and Mallick, B. (1998). Variable selection for regression models, Sankhya: The Indian Journal of Statistics, B 60, 65-81.
  17. Lee, T. C., Judge, G. G., and Zellner, A. (1968). Maximum likelihood and Bayesian estimation of transition probabilities, Journal of the American Statistical Association, 63, 1162-1179. https://doi.org/10.1080/01621459.1968.10480918
  18. Lee, T. C., Judge, G. G., and Zellner, A. (1970). Estimating the Parameters of the Markov Probability Model from Aggregate Time Series Data, North-Holland Pub. Co., Amsterdam.
  19. Meshkani, M. (1978). Empirical Bayes estimation of transition probabilities for Markov chains (Ph.D. Dissertation), Florida State University.
  20. Singer, B. and Spilerman, S. (1976a). The representation of social processes by Markov models, American Journal of Sociology, 82, 1-54. https://doi.org/10.1086/226269
  21. Singer, B. and Spilerman, S. (1976b). Some methodological issues in the analysis of longitudinal surveys, Annals of Economic and Sociological Measurement, 5, 447-474.
  22. Sommer, A., Katz, J., and Tarwotjo, I. (1984). Increased risk of respiratory infection and diarrhea in children with pre-existing mild vitamin A deficiency, American Journal of Clinical Nutrition, 40, 1090-1095. https://doi.org/10.1093/ajcn/40.5.1090
  23. Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and van der Linde, A. (2002). Bayesian measures of model complexity and fit, Journal of the Royal Statistical Society: Series B, 64, 583-639. https://doi.org/10.1111/1467-9868.00353
  24. Tanner, M. A. and Wong, W. H. (1987). The calculation of posterior distributions by data augmentation, Journal of the American Statistical Association, 82, 528-549. https://doi.org/10.1080/01621459.1987.10478458
  25. Wasserman, S. (1980). Analyzing social networks as stochastic processes, Journal of the American Statistical Association, 75, 280-294. https://doi.org/10.1080/01621459.1980.10477465
  26. Zeger, S. L. and Qaqish, B. (1988). Markov regression models for time series: a quasi-likelihood approach, Biometrics, 44, 1019-1031. https://doi.org/10.2307/2531732