http://dx.doi.org/10.5351/KJAS.2022.35.1.131

Bayesian logit models with auxiliary mixture sampling for analyzing diabetes diagnosis data  

Rhee, Eun Hee (Department of Applied Statistics, Chung-Ang University)
Hwang, Beom Seuk (Department of Applied Statistics, Chung-Ang University)
Publication Information
The Korean Journal of Applied Statistics, v.35, no.1, 2022, pp. 131-146
Abstract
Logit models are commonly used to predict and classify categorical response variables. Most Bayesian approaches to logit models are implemented with the Metropolis-Hastings algorithm. However, this algorithm suffers from slow convergence, and it is difficult to choose an adequate proposal distribution. Therefore, we use the auxiliary mixture sampler proposed by Frühwirth-Schnatter and Frühwirth (2007) to estimate logit models. This method introduces two sequences of auxiliary latent variables so that the logit model becomes linear and Gaussian conditional on them. As a result, the logit model can be easily implemented by Gibbs sampling. We applied the proposed method to diabetes data from the Community Health Survey (2020) of the Korea Disease Control and Prevention Agency and compared its performance with that of the Metropolis-Hastings algorithm. In addition, we showed that the logit model using auxiliary mixture sampling achieves classification performance comparable to that of machine learning models.
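For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of a random-walk Metropolis-Hastings sampler for a binary logit model. It is not the authors' implementation and it is not the auxiliary mixture sampler. The simulated data, the normal prior with standard deviation `prior_sd`, the proposal scale `step`, and the function names are all assumptions made for illustration.

```python
import numpy as np

def log_posterior(beta, X, y, prior_sd=10.0):
    """Unnormalized log posterior of a logit model with independent
    N(0, prior_sd^2) priors on the coefficients (an assumed prior)."""
    eta = X @ beta
    # Bernoulli-logit log likelihood, using logaddexp for numerical stability
    log_lik = np.sum(y * eta - np.logaddexp(0.0, eta))
    log_prior = -0.5 * np.sum(beta ** 2) / prior_sd ** 2
    return log_lik + log_prior

def random_walk_mh(X, y, n_iter=5000, step=0.1, seed=0):
    """Random-walk Metropolis-Hastings; `step` is the proposal scale
    that has to be tuned by hand."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    beta = np.zeros(p)
    current = log_posterior(beta, X, y)
    draws = np.empty((n_iter, p))
    accepted = 0
    for t in range(n_iter):
        # Symmetric Gaussian proposal centered at the current value
        proposal = beta + step * rng.standard_normal(p)
        candidate = log_posterior(proposal, X, y)
        # Accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < candidate - current:
            beta, current = proposal, candidate
            accepted += 1
        draws[t] = beta
    return draws, accepted / n_iter

# Simulated data (hypothetical, not the Community Health Survey data)
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
true_beta = np.array([-1.0, 2.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))

draws, acc_rate = random_walk_mh(X, y)
posterior_mean = draws[1000:].mean(axis=0)  # discard the first 1000 draws as burn-in
```

The acceptance rate and the mixing of the chain depend heavily on `step`, which is the tuning difficulty the abstract refers to. A Gibbs sampler based on auxiliary variables avoids this choice because every full conditional distribution can be sampled directly.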
Keywords
Bayesian inference; Community Health Survey; classification; logistic regression model; Markov chain Monte Carlo
References
1 Albert JH and Chib S (1993). Bayes inference via Gibbs sampling of autoregressive time series subject to Markov mean and variance shifts, Journal of Business and Economic Statistics, 11, 1-15.
2 Chib S and Greenberg E (1995). Understanding the Metropolis-Hastings algorithm, The American Statistician, 49, 327-335.
3 Chib S, Greenberg E, and Winkelmann R (1998). Posterior simulation and Bayes factors in panel count data models, Journal of Econometrics, 86, 33-54.
4 Chib S, Nardari F, and Shephard N (2002). Markov chain Monte Carlo methods for stochastic volatility models, Journal of Econometrics, 108, 281-316.
5 Gamerman D (1997). Sampling from the posterior distribution in generalized linear mixed models, Statistics and Computing, 7, 57-68.
6 Held L and Holmes CC (2006). Bayesian auxiliary variable models for binary and multinomial regression, Bayesian Analysis, 1, 145-168.
7 Kim S, Shephard N, and Chib S (1998). Stochastic volatility: likelihood inference and comparison with ARCH models, The Review of Economic Studies, 65, 361-393.
8 King G and Zeng L (2001). Logistic regression in rare events data, Political Analysis, 9, 137-163.
9 Scott SL (2011). Data augmentation, frequentist estimation, and the Bayesian analysis of multinomial logit models, Statistical Papers, 52, 87-109.
10 Titterington DM, Smith AFM, and Makov UE (1985). Statistical Analysis of Finite Mixture Distributions (Vol. 198), John Wiley and Sons Incorporated.
11 Zellner A and Rossi PE (1984). Bayesian analysis of dichotomous quantal response models, Journal of Econometrics, 25, 365-393.
12 Kim YM, Cho DG, and Kang SH (2014). An empirical analysis on geographic variations in the prevalence of diabetes, Health and Social Welfare Review, 34, 82-105.
13 Kim SB and Hwang BS (2019). A Bayesian skewed logit model for high-risk drinking data, The Korean Data and Information Science Society, 30, 335-348.
14 Roberts GO, Gelman A, and Gilks WR (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms, The Annals of Applied Probability, 7, 110-120.
15 Nanayakkara N, Curtis AJ, Heritier S, et al. (2020). Impact of age at type 2 diabetes mellitus diagnosis on mortality and vascular complications: systematic review and meta-analyses, Diabetologia, 64, 275-287.
16 Chen MH, Dey DK, and Shao QM (1999). A new skewed link model for dichotomous quantal response data, Journal of the American Statistical Association, 94, 1172-1186.
17 Frühwirth-Schnatter S and Frühwirth R (2007). Auxiliary mixture sampling with applications to logistic models, Computational Statistics and Data Analysis, 51, 3509-3528.
18 Frühwirth-Schnatter S, Frühwirth R, Held L, and Rue H (2009). Improved auxiliary mixture sampling for hierarchical models of non-Gaussian data, Statistics and Computing, 19, 479-492.
19 Geweke J and Keane M (1999). Mixture of normals probit models, In Hsiao C, Pesaran MH, Lahiri KL, and Lee LF (Eds.), Analysis of Panels and Limited Dependent Variable Models (pp. 49-78), Cambridge University Press, Cambridge.
20 International Diabetes Federation (2019). IDF Diabetes Atlas (9th ed.), retrieved from: https://www.diabetesatlas.org
21 Kim YH and Hwang BS (2020). Joint analysis of binary and continuous data using skewed logit model in developmental toxicity studies, The Korean Journal of Applied Statistics, 33, 123-136.
22 R Core Team (2021). R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria.
23 Lenk PJ and DeSarbo WS (2000). Bayesian inference for finite mixtures of generalized linear models with random effects, Psychometrika, 65, 93-119.
24 McFadden D (1973). Conditional logit analysis of qualitative choice behavior, Frontiers in Econometrics, Academic Press, New York, 105-142.
25 Omori Y, Chib S, Shephard N, and Nakajima J (2007). Stochastic volatility with leverage: Fast and efficient likelihood inference, Journal of Econometrics, 140, 425-449.
26 Shephard N (1994). Partial non-Gaussian state space, Biometrika, 81, 115-131.
27 Song KE, Kim DJ, Park JW, Cho HK, Lee KW, and Huh KB (2007). Clinical characteristics of Korean type 2 diabetic patients according to insulin secretion and insulin resistance, Diabetes and Metabolism Journal, 31, 123-129.
28 Theodoridis S (2015). Machine Learning: A Bayesian and Optimization Perspective, Academic Press.
29 World Health Organization Regional Office for the Western Pacific (2000). The Asia-Pacific perspective: redefining obesity and its treatment, Sydney: Health Communications Australia, retrieved from: https://apps.who.int/iris/handle/10665/206936