• Title/Summary/Keyword: Gibbs Sampling method

Bayesian logit models with auxiliary mixture sampling for analyzing diabetes diagnosis data (보조 혼합 샘플링을 이용한 베이지안 로지스틱 회귀모형 : 당뇨병 자료에 적용 및 분류에서의 성능 비교)

  • Rhee, Eun Hee;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.131-146 / 2022
  • Logit models are commonly used to predict and classify categorical response variables. Most Bayesian approaches to logit models are implemented using the Metropolis-Hastings algorithm. However, that algorithm suffers from slow convergence and from the difficulty of choosing an adequate proposal distribution. Therefore, we use the auxiliary mixture sampler proposed by Frühwirth-Schnatter and Frühwirth (2007) to estimate logit models. This method introduces two sequences of auxiliary latent variables so that the logit model satisfies normality and linearity. As a result, the logit model can be easily implemented by Gibbs sampling. We applied the proposed method to diabetes data from the Community Health Survey (2020) of the Korea Disease Control and Prevention Agency and compared its performance with that of the Metropolis-Hastings algorithm. In addition, we showed that the logit model using auxiliary mixture sampling achieves classification performance comparable to that of machine learning models.

Bayesian Test of Quasi-Independence in a Sparse Two-Way Contingency Table

  • Kwak, Sang-Gyu;Kim, Dal-Ho
    • Communications for Statistical Applications and Methods / v.19 no.3 / pp.495-500 / 2012
  • We consider a Bayesian test of independence in a two-way contingency table that has some zero cells. To do this, we adopt a three-stage hierarchical Bayesian model under each hypothesis. As priors, we use Dirichlet densities to model the marginal and individual cell probabilities. Our method does not require complicated computation, such as a Metropolis-Hastings algorithm, to draw samples from the posterior density of each parameter. Instead, we draw samples using a Gibbs sampler with a grid method. For complicated posterior formulas, we apply Monte Carlo integration and the sampling importance resampling algorithm. We compare the values of the Bayes factor with the results of a chi-square test and the likelihood ratio test.

Uncertainty decomposition in climate-change impact assessments: a Bayesian perspective

  • Ohn, Ilsang;Seo, Seung Beom;Kim, Seonghyeon;Kim, Young-Oh;Kim, Yongdai
    • Communications for Statistical Applications and Methods / v.27 no.1 / pp.109-128 / 2020
  • A climate-impact projection usually consists of several stages, and the uncertainty of the projection is known to be quite large. It is necessary to assess how much each stage contributes to that uncertainty. We use the term uncertainty decomposition for an uncertainty quantification method in which the relative contribution of each stage can be evaluated. We propose a new Bayesian model for uncertainty decomposition in climate change impact assessments. The proposed Bayesian model can incorporate the uncertainty of natural variability and utilize data from the control period. We provide a simple and efficient Gibbs sampling algorithm using the auxiliary variable technique. We compare the proposed method with existing uncertainty decomposition methods by analyzing streamflow data for the Yongdam Dam basin on the Geum River in South Korea.

A Bayesian Approach to Detecting Outliers Using Variance-Inflation Model

  • Lee, Sangjeen;Chung, Younshik
    • Communications for Statistical Applications and Methods / v.8 no.3 / pp.805-814 / 2001
  • The problem of 'outliers', observations which look suspicious in some way, has long been one of the main concerns of experimenters and data analysts. We propose a model for the outlier problem and analyze it in the linear regression model using a Bayesian approach with the variance-inflation model. We use Geweke's (1996) idea, which is based on the data augmentation method, for detecting outliers in the linear regression model. The advantage of the proposed method is that it finds the subset of data that is most suspicious under the given model by its posterior probability. A sampling-based approach can be used to handle the complicated Bayesian computation. Finally, the proposed methodology is applied to simulated and real data.

Bayesian Analysis of Software Reliability Growth Model with Negative Binomial Information (음이항분포 정보를 가진 베이지안 소프트웨어 신뢰도 성장모형에 관한 연구)

  • Kim, Hui-Cheol;Park, Jong-Gu;Lee, Byeong-Su
    • The Transactions of the Korea Information Processing Society / v.7 no.3 / pp.852-861 / 2000
  • Software reliability growth models are used in the testing stages of software development to model the error content and the time intervals between software failures. In this paper, using a negative binomial prior for the number of faults and a gamma prior for the error rate, we study Bayesian inference and model selection for the Jelinski-Moranda, Goel-Okumoto, and Schick-Wolverton models in software reliability. For model selection, we explored the sum of the relative error, the Braun statistic, and the median variation. In the Bayesian computation, we avoided multiple integration by using Gibbs sampling, a Markov chain Monte Carlo method, to compute the posterior distribution. Bayesian inference and model selection are studied using simulated data.

The NHPP Bayesian Software Reliability Model Using Latent Variables (잠재변수를 이용한 NHPP 베이지안 소프트웨어 신뢰성 모형에 관한 연구)

  • Kim, Hee-Cheul;Shin, Hyun-Cheul
    • Convergence Security Journal / v.6 no.3 / pp.117-126 / 2006
  • Bayesian inference and model selection methods for software reliability growth models are studied. Software reliability growth models are used in the testing stages of software development to model the error content and the time intervals between software failures. In this paper, we avoid multiple integration by using Gibbs sampling, a Markov chain Monte Carlo method, to compute the posterior distribution. Bayesian inference for general order statistics models in software reliability with diffuse prior information is studied, together with a model selection method. For model determination and selection, we explore goodness of fit (the error sum of squares) and trend tests. The methodology developed in this paper is illustrated with a random software reliability data set generated from a Weibull distribution (shape 2, scale 5) using the Minitab (version 14) statistical package.

Bayesian quantile regression analysis of private education expenses for high school students in Korea (일반계 고등학생 사교육비 지출에 대한 베이지안 분위회귀모형 분석)

  • Oh, Hyun Sook
    • Journal of the Korean Data and Information Science Society / v.28 no.6 / pp.1457-1469 / 2017
  • Private education expenses are one of the key issues in Korea, and there have been many discussions about them. Academically, most previous research on private education expenses has used multiple linear regression models based on the ordinary least squares (OLS) method. However, if the data do not satisfy the basic assumptions of the OLS method, such as normality and homoscedasticity, the reliability of the parameter estimates is in question. In this case, a quantile regression model is preferred to the OLS model since it does not depend on the assumptions of normality and homoscedasticity. In the present study, data from a survey on private education expenses conducted by Statistics Korea in 2015 are analyzed to investigate the factors that affect private education expenses. Since the data do not satisfy the OLS assumptions, a quantile regression model is employed in a Bayesian approach using the Gibbs sampling method. The results show that the gender of the student, the parents' age, and the time and cost of participating in after-school programs are not significant. Household income is positively significant, with a similar magnitude at all levels (quantiles) of private education expenses. Spending on private education is higher in Seoul than in other regions, and the regional difference grows as private education expenditure increases. Total time for private education and the student's achievement have a larger positive effect at the lower quantiles than at the higher quantiles. The father's education level is positively significant for the medium-high quantiles only, whereas the mother's education level is positively significant for all but the low quantiles. Participation in after-school programs is positively significant for the lower quantiles, whereas EBS textbook cost is positively significant for the higher quantiles.

Bayesian Analysis and Mapping of Elderly Korean Suicide Rates (베이지안 모형을 활용한 국내 노인 자살률 질병지도)

  • Lee, Jayoun;Kim, Dal Ho
    • The Korean Journal of Applied Statistics / v.28 no.2 / pp.325-334 / 2015
  • Elderly suicide rates tend to be high in Korea. Suicide by the elderly is no longer a personal problem; consequently, further research on risk and regional factors is necessary. Disease mapping in epidemiology estimates spatial patterns of disease risk over a geographical region. In this study, we use a simultaneous conditional autoregressive model for spatial correlations between neighboring areas to estimate and map standardized mortality ratios. The method is illustrated with cause-of-death data from 2006 and 2010 to analyze regional patterns of elderly suicide in Korea. In the Bayesian spatial models, which account for spatial correlations, mean educational attainment and the percentage of the elderly who live alone were the significant regional characteristics for elderly suicide. Gibbs sampling and a grid method are used for computation.

Development of a Recursive Multinomial Probit Model and its Possible Application for Innovation Studies

  • Jeong, Gicheol
    • STI Policy Review / v.2 no.2 / pp.45-54 / 2011
  • This paper develops a recursive multinomial probit model and describes its estimation method. The recursive multinomial probit model is an extension of the recursive bivariate probit model. The main difference between the two models is that, in the proposed model, each choice equation can represent a single decision among two or more alternatives. The recursive multinomial probit model is developed within the standard framework of the multinomial probit model, and a Bayesian approach with Gibbs sampling is adopted for estimation. A simulation exercise with artificial data sets showed that the model performed well. Since the recursive multinomial probit model can be applied to analyze causal relationships between discrete dependent variables with more than two outcomes, the model can play an important role in extending the methodology of causal relationship analysis in innovation research.

A Study of Anomaly Detection Method Using Bayesian Network (베이지안 네트워크를 이용한 비정상행위 탐지 기법 연구)

  • Cheong, Il-An;Kim, Min-Soo;Noh, Bong-Nam
    • Proceedings of the Korea Information Processing Society Conference / 2001.10b / pp.1061-1064 / 2001
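  • Statistical techniques have generally been used to detect anomalous behavior. In this paper, to compensate for the shortcomings of statistical techniques, we study an effective method for judging and analyzing anomalous behavior that exploits the advantages of Bayesian networks. We use the Sparse Candidate algorithm, which is efficient for Bayesian network learning, to learn users' normal behavior from Linux system audit data (LSM audit data), and we apply the Gibbs sampling method so that inference remains possible even when part of the audit data is missing, thereby helping to identify anomalous behavior by system users.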
