Bayesian inference of longitudinal Markov binary regression models with t-link function

Sim, Bohyun;Chung, Younshik;

doi:10.5351/KJAS.2020.33.1.047

The Korean Journal of Applied Statistics (응용통계연구)
Volume 33, Issue 1, Pages 47-59, 2020
pISSN 1225-066X, eISSN 2383-5818

The Korean Statistical Society (한국통계학회)

Bayesian analysis of longitudinal data on Indonesian children using a Markov binary regression model with t-link

Sim, Bohyun (Department of Statistics, Pusan National University) ;
Chung, Younshik (Department of Statistics, Pusan National University)

Received: 2019.11.21
Reviewed: 2020.01.08
Published: 2020.02.29

https://doi.org/10.5351/KJAS.2020.33.1.047

Abstract

In this paper, we present a longitudinal Markov binary regression model with a t-link function for the cases where the transition order is known and where it is unknown. Logit and probit models are the usual choices for binary regression. The t-link function offers more flexibility than the probit link, because the t distribution approaches the normal distribution as its degrees of freedom go to infinity, so the probit model arises as a limiting case. A Markov regression model is used because the data consist of repeated (longitudinal) observations on each individual. We propose a Bayesian method for determining the transition order of the Markov regression model. When the order of each candidate model is treated as known, the candidate models are compared using the deviance information criterion (DIC) (Spiegelhalter et al., 2002); when the order is unknown, the posterior probabilities of the possible models are computed and compared. To make the Bayesian computation tractable, the proposed model is reformulated following the ideas of Albert and Chib (1993), Kuo and Mallick (1998), and Erkanli et al. (2001). The proposed method is applied to simulated data and to the longitudinal data on Indonesian children examined by Sommer et al. (1984). Markov chain Monte Carlo (MCMC) methods are used to find the optimal model under each assumption, known or unknown transition order. The method of Gelman and Rubin (1992) is used to check the convergence of the Metropolis-Hastings algorithm.
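The latent-variable idea credited to Albert and Chib (1993) can be illustrated with a small simulation. The sketch below is not the authors' sampler, and the model form is an assumption, since this page does not give the paper's exact specification. It assumes a first-order model with linear predictor `beta0 + beta1 * x_t + gamma * y_prev`, where all parameter names are hypothetical. The response is 1 when the predictor plus a Student-t error is positive, which corresponds to a t-link for the success probability. The t error is drawn as a normal variate divided by the square root of a Gamma(nu/2, rate nu/2) variate, the scale-mixture form that makes data augmentation convenient.

```python
import random


def draw_t_error(nu, rng):
    """Draw a Student-t(nu) variate as a scale mixture of normals.

    lam ~ Gamma(shape=nu/2, rate=nu/2), i.e. scale 2/nu, so E[lam] = 1.
    A standard normal divided by sqrt(lam) then follows t with nu d.f.
    """
    lam = rng.gammavariate(nu / 2.0, 2.0 / nu)  # second argument is the scale
    return rng.gauss(0.0, 1.0) / lam ** 0.5


def simulate_markov_t_link(x, beta0, beta1, gamma, nu, seed=0, y0=0):
    """Simulate binary responses from an assumed first-order Markov model.

    Hypothetical specification (not taken from the paper):
        z_t = beta0 + beta1 * x_t + gamma * y_{t-1} + e_t,  e_t ~ t(nu)
        y_t = 1 if z_t > 0 else 0
    """
    rng = random.Random(seed)
    y_prev = y0
    ys = []
    for x_t in x:
        eta = beta0 + beta1 * x_t + gamma * y_prev  # linear predictor
        z = eta + draw_t_error(nu, rng)             # latent variable
        y_prev = 1 if z > 0 else 0                  # observed binary response
        ys.append(y_prev)
    return ys


if __name__ == "__main__":
    covariate = [0.1, -0.2, 0.3, 0.0, 0.5, -0.4]
    print(simulate_markov_t_link(covariate, 0.2, 1.0, 0.8, nu=4, seed=1))
```

A higher-order variant would add terms for `y_{t-2}`, `y_{t-3}`, and so on. Comparing such variants is the model-choice problem that the abstract addresses with DIC and posterior model probabilities.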

References

Albert, J. H. and Chib, S. (1993). Bayesian analysis of binary and polychotomous response data, Journal of the American Statistical Association, 88, 669-679. https://doi.org/10.1080/01621459.1993.10476321
Azzalini, A. (1982). Approximate filtering of parameter driven processes, Journal of Time Series Analysis, 3, 219-223. https://doi.org/10.1111/j.1467-9892.1982.tb00344.x
Bartholomew, D. J. (1983). Some recent developments in social statistics, International Statistical Review, 51, 1-9. https://doi.org/10.2307/1402728
Brooks, S. P. and Gelman, A. (1998). General methods for monitoring convergence of iterative simulations, Journal of Computational and Graphical Statistics, 7, 434-455. https://doi.org/10.2307/1390675
Cox, D. R. (1970). The Analysis of Binary Data, Methuen, London.
Cox, D. R. (1981). Statistical analysis of time series: some recent developments, Scandinavian Journal of Statistics, 8, 93-115.
Erkanli, A., Soyer, R., and Angold, A. (2001). Bayesian analyses of longitudinal binary data using Markov regression models of unknown order, Statistics in Medicine, 20, 755-770. https://doi.org/10.1002/sim.702
Fisher, R. A. (1925). Applications of "Student's" distribution, Metron, 5, 90-104.
Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences, Statistical Science, 7, 457-511. https://doi.org/10.1214/ss/1177011136
George, E. I. and McCulloch, R. E. (1993). Variable selection via Gibbs sampling, Journal of the American Statistical Association, 88, 881-889. https://doi.org/10.1080/01621459.1993.10476353
George, E. I. and McCulloch, R. E. (1997). Approaches for Bayesian variable selection, Statistica Sinica, 7, 339-373.
Gosset, W. S. (1908). The probable error of a mean, Biometrika, 6, 1-25. https://doi.org/10.2307/2331554
Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika, 82, 711-732. https://doi.org/10.1093/biomet/82.4.711
Kalbfleisch, J. D. and Lawless, J. F. (1985). The analysis of panel data under a Markov assumption, Journal of the American Statistical Association, 80, 863-871. https://doi.org/10.1080/01621459.1985.10478195
Korn, E. L. and Whittemore, A. S. (1979). Methods for analyzing panel studies of acute health effects of air pollution, Biometrics, 35, 795-802. https://doi.org/10.2307/2530111
Kuo, L. and Mallick, B. (1998). Variable selection for regression models, Sankhya: The Indian Journal of Statistics, B 60, 65-81.
Lee, T. C., Judge, G. G., and Zellner, A. (1968). Maximum likelihood and Bayesian estimation of transition probabilities, Journal of the American Statistical Association, 63, 1162-1179. https://doi.org/10.1080/01621459.1968.10480918
Lee, T. C., Judge, G. G., and Zellner, A. (1970). Estimating the Parameters of the Markov Probability Model from Aggregate Time Series Data, North-Holland Pub. Co., Amsterdam.
Meshkani, M. (1978). Empirical Bayes estimation of transition probabilities for Markov chains (Ph.D. Dissertation), Florida State University.
Singer, B. and Spilerman, S. (1976a). The representation of social processes by Markov models, American Journal of Sociology, 82, 1-54. https://doi.org/10.1086/226269
Singer, B. and Spilerman, S. (1976b). Some methodological issues in the analysis of longitudinal surveys, Annals of Economic and Sociological Measurement, 5, 447-474.
Sommer, A., Katz, J., and Tarwotjo, I. (1984). Increased risk of respiratory infection and diarrhea in children with pre-existing mild vitamin A deficiency, American Journal of Clinical Nutrition, 40, 1090-1095. https://doi.org/10.1093/ajcn/40.5.1090
Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and van der Linde, A. (2002). Bayesian measures of model complexity and fit, Journal of the Royal Statistical Society: Series B, 64, 583-639. https://doi.org/10.1111/1467-9868.00353
Tanner, M. A. and Wong, W. H. (1987). The calculation of posterior distributions by data augmentation, Journal of the American Statistical Association, 82, 528-549. https://doi.org/10.1080/01621459.1987.10478458
Wasserman, S. (1980). Analyzing social networks as stochastic processes, Journal of the American Statistical Association, 75, 280-294. https://doi.org/10.1080/01621459.1980.10477465
Zeger, S. L. and Qaqish, B. (1988). Markov regression models for time series: a quasi-likelihood approach, Biometrics, 44, 1019-1031. https://doi.org/10.2307/2531732
