• Title/Abstract/Keywords: Markov chain Monte Carlo


Sparse Data Cleaning using Multiple Imputations

  • Jun, Sung-Hae;Lee, Seung-Joo;Oh, Kyung-Whan
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 4, No. 1
    • /
    • pp.119-124
    • /
    • 2004
  • Real data such as web log files tend to be incomplete, yet we must extract useful knowledge from them to make optimal decisions. Web log data contain much useful information, including hyperlink information and the web usage of connected users. However, web data are too large to use for effective knowledge discovery and, to make matters worse, they are very sparse. We overcome this sparsity problem by using the Markov chain Monte Carlo (MCMC) method for multiple imputation. This missing-value imputation turns sparse web data into complete data. Our approach may be a useful tool for discovering knowledge from sparse data sets. The sparser the data, the better MCMC imputation performs. We verified our work through experiments using data from the UCI Machine Learning Repository.
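
The abstract above does not spell out its imputation procedure. As a rough illustration of MCMC-based multiple imputation in general, the following sketch applies data augmentation to a single numeric column. The univariate normal model, the noninformative prior, and the names `mcmc_impute`, `burn_in`, and `thin` are assumptions made for this example, not the authors' method. It also assumes at least two distinct observed values.

```python
import random
import statistics


def mcmc_impute(values, n_imputations=5, burn_in=200, thin=50, seed=0):
    """Return n_imputations completed copies of values (None marks a gap)."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    missing_idx = [i for i, v in enumerate(values) if v is None]
    n = len(values)
    # Start the chain by filling every gap with the observed mean.
    start = statistics.fmean(observed)
    completed = [start if v is None else v for v in values]
    imputations = []
    for step in range(1, burn_in + n_imputations * thin + 1):
        # P-step: draw (mu, sigma^2) from their posterior given the completed
        # data, under the assumed prior p(mu, sigma^2) proportional to 1/sigma^2.
        mean = statistics.fmean(completed)
        sum_sq = sum((x - mean) ** 2 for x in completed)
        chi2 = rng.gammavariate((n - 1) / 2, 2.0)  # chi-squared, n - 1 d.o.f.
        sigma2 = sum_sq / chi2
        mu = rng.gauss(mean, (sigma2 / n) ** 0.5)
        # I-step: redraw each missing value from N(mu, sigma^2).
        for i in missing_idx:
            completed[i] = rng.gauss(mu, sigma2 ** 0.5)
        # Keep widely spaced states so the imputations are nearly independent.
        if step > burn_in and (step - burn_in) % thin == 0:
            imputations.append(list(completed))
    return imputations


# Hypothetical toy column with three missing entries.
data = [4.1, None, 5.3, 4.8, None, 5.9, 4.4, None, 5.0, 5.6]
imputed_sets = mcmc_impute(data)
```

Each completed copy would then be analyzed separately and the results combined, which is what distinguishes multiple imputation from filling each gap once.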

Bayesian pooling for contingency tables from small areas

  • Jo, Aejung;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 27, No. 6
    • /
    • pp.1621-1629
    • /
    • 2016
  • This paper studies Bayesian pooling for the analysis of categorical data from small areas. Many surveys consist of categorical data collected on a contingency table in each area. Statistical inference for small areas requires considerable care because the subpopulation sample sizes are usually very small. Typically, a hierarchical Bayesian model is used to pool subpopulation data. However, the customary hierarchical Bayesian models may specify more exchangeability than is warranted. We therefore investigate the effects of pooling in hierarchical Bayesian modeling of contingency tables from small areas. Specifically, this paper focuses on methods for direct or indirect pooling, through Dirichlet priors, of categorical data collected on a contingency table in each area. We compare the pooling effects of the hierarchical Bayesian models by fitting them to simulated data. The analysis is carried out using Markov chain Monte Carlo methods.

Bayesian Model for Cost Estimation of Construction Projects

  • Kim, Sang-Yon
    • Journal of the Korea Institute of Building Construction (한국건축시공학회지)
    • /
    • Vol. 11, No. 1
    • /
    • pp.91-99
    • /
    • 2011
  • A Bayesian network is a form of probabilistic graphical model. It incorporates human reasoning to deal with sparse data and to determine the probabilities of uncertain cases. In this research, a Bayesian network is adopted to model construction project cost. General information, time, cost, and material, the four main factors that dominate the characteristics of construction costs, are incorporated into the model. This research presents and verifies a model that illustrates the functionality and application of a decision support system for predicting costs. The Markov chain Monte Carlo (MCMC) method is applied to estimate parameter distributions. Furthermore, it is shown that not all the parameters are normally distributed. In addition, cost estimates are computed from the Gibbs output. The model can enhance the decision-making process.

Bayesian analysis of financial volatilities addressing long-memory, conditional heteroscedasticity and skewed error distribution

  • Oh, Rosy;Shin, Dong Wan;Oh, Man-Suk
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 24, No. 5
    • /
    • pp.507-518
    • /
    • 2017
  • Volatility plays a crucial role in the theory and applications of asset pricing, optimal portfolio allocation, and risk management. This paper proposes a combined model of autoregressive fractionally integrated moving average (ARFIMA), generalized autoregressive conditional heteroscedasticity (GARCH), and a skewed-t error distribution to accommodate important features of volatility data: long memory, heteroscedasticity, and an asymmetric error distribution. A fully Bayesian approach is proposed to estimate the parameters of the model simultaneously, which yields parameter estimates that satisfy the necessary constraints in the model. The approach can be easily implemented using JAGS, a free and user-friendly software package, to generate Markov chain Monte Carlo samples from the joint posterior distribution of the parameters. The method is illustrated using a daily volatility index from the Chicago Board Options Exchange (CBOE). JAGS code for the model specification is provided in the Appendix.

Methods and Techniques for Variance Component Estimation in Animal Breeding - Review -

  • Lee, C.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • Vol. 13, No. 3
    • /
    • pp.413-422
    • /
    • 2000
  • In models that include random effects, variance component estimates are needed to obtain accurate predictors and estimators. Variance component estimation is straightforward for balanced data but not for unbalanced data. Since orthogonality among factors is absent in unbalanced data, various methods for variance component estimation are available. REML estimation is the most widely used method in animal breeding because of its attractive statistical properties. Recently, the Bayesian approach has become feasible through Markov chain Monte Carlo methods and increasingly powerful computers. Furthermore, advances in variance component estimation with complicated models such as generalized linear mixed models have enabled animal breeders to analyze non-normal data.

Bayesian Analysis for Heat Effects on Mortality

  • Jo, Young-In;Lim, Youn-Hee;Kim, Ho;Lee, Jae-Yong
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 19, No. 5
    • /
    • pp.705-720
    • /
    • 2012
  • In this paper, we introduce a hierarchical Bayesian model to simultaneously estimate the thresholds for each of six cities. It has been noted in the literature that the number of deaths increases dramatically once the mean temperature passes a certain value, which we call a threshold. We estimate the difference in mortality before and after the threshold. For the hierarchical Bayesian analysis, proper prior distributions are assumed for the parameters and hyper-parameters. By combining the Gibbs and Metropolis-Hastings algorithms, we constructed a Markov chain Monte Carlo algorithm, and the posterior inference was based on the resulting posterior sample. The analysis shows that the estimates of the threshold lie between $25^{\circ}\mathrm{C}$ and $29^{\circ}\mathrm{C}$ and that the mortality around the threshold changes from -1% to 2~13%.
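
Combining Gibbs and Metropolis-Hastings updates, as the abstract above describes, usually means sampling some parameters exactly from their conditional distributions and updating the rest with accept/reject proposals. The sketch below shows that pattern on a toy single-city hinge model, y = b * max(x - tau, 0) + N(0, 1) noise. The model, the flat priors, the simulated data, and all names (`hinge`, `sample_threshold_model`, `lo`, `hi`) are assumptions for illustration only, not the paper's hierarchical model. It assumes `hi` is below the largest `x`, so some points always lie above the threshold.

```python
import math
import random


def hinge(x, tau):
    """Excess of x over the threshold tau, or zero below it."""
    return max(x - tau, 0.0)


def sample_threshold_model(xs, ys, lo, hi, n_iter=4000, step=0.3, seed=1):
    """Metropolis-within-Gibbs draws of (tau, b) for the toy hinge model."""
    rng = random.Random(seed)
    tau = (lo + hi) / 2
    draws = []

    def log_lik(tau_value, slope):
        return -0.5 * sum((y - slope * hinge(x, tau_value)) ** 2
                          for x, y in zip(xs, ys))

    for _ in range(n_iter):
        # Gibbs step: under a flat prior, b | tau, y is exactly normal.
        z = [hinge(x, tau) for x in xs]
        szz = sum(v * v for v in z)
        szy = sum(v * y for v, y in zip(z, ys))
        b = rng.gauss(szy / szz, 1.0 / math.sqrt(szz))
        # Metropolis step: random-walk proposal for tau with a uniform prior
        # on [lo, hi]; proposals outside the interval are rejected.
        proposal = tau + rng.gauss(0.0, step)
        if lo <= proposal <= hi:
            log_ratio = log_lik(proposal, b) - log_lik(tau, b)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                tau = proposal
        draws.append((tau, b))
    return draws


# Simulated data with a true threshold of 27 and a true slope of 2.
sim = random.Random(0)
temps = [sim.uniform(15.0, 35.0) for _ in range(200)]
excess = [2.0 * hinge(t, 27.0) + sim.gauss(0.0, 1.0) for t in temps]
draws = sample_threshold_model(temps, excess, lo=20.0, hi=32.0)
kept = draws[1000:]  # discard burn-in
tau_mean = sum(t for t, _ in kept) / len(kept)
b_mean = sum(s for _, s in kept) / len(kept)
```

The slope gets a Gibbs step because its conditional is available in closed form. The threshold enters the likelihood through a kink, so it has no standard conditional and is updated by Metropolis instead.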

ASSESSING POPULATION BIOEQUIVALENCE IN A $2{\times}2$ CROSSOVER DESIGN WITH CARRYOVER EFFECT IN A BAYESIAN PERSPECTIVE

  • Oh Hyun-Sook
    • Journal of the Korean Statistical Society
    • /
    • Vol. 35, No. 3
    • /
    • pp.239-250
    • /
    • 2006
  • A $2{\times}2$ crossover design including a carryover effect is considered for assessing the population bioequivalence of two drug formulations in a Bayesian framework. In classical analysis, the carryover effect is difficult to handle because the estimate of the drug effect is biased in its presence. The method proposed in this article uses uninformative priors and vague proper priors to keep the priors objective, and the posterior probability distribution of the parameters of interest is derived under the given priors. The posterior probabilities of the hypotheses for assessing population bioequivalence are evaluated using a Markov chain Monte Carlo simulation method. An example with a real data set is given for illustration.

Efficient Markov Chain Monte Carlo for Bayesian Analysis of Neural Network Models

  • Paul E. Green;Changha Hwang;Lee, Sangbock
    • Journal of the Korean Statistical Society
    • /
    • Vol. 31, No. 1
    • /
    • pp.63-75
    • /
    • 2002
  • Most attempts at Bayesian analysis of neural networks involve hierarchical modeling. We believe that similar results can be obtained with simpler models that require less computational effort, as long as appropriate restrictions are placed on the parameters to ensure propriety of the posterior distributions. In particular, we adopt a model first introduced by Lee (1999) that uses an improper prior for all parameters. Straightforward Gibbs sampling is possible, with the exception of the bias parameters, which are embedded in nonlinear sigmoidal functions. In addition to the problems posed by nonlinearity, direct sampling from the posterior distributions of the bias parameters is complicated by the duplication of hidden nodes, which is a source of multimodality. We therefore focus on sampling from the marginal posterior distribution of the bias parameters with Markov chain Monte Carlo methods that combine traditional Metropolis sampling with a slice sampler described by Neal (1997, 2001). The methods are illustrated with data examples that are largely confined to the analysis of nonparametric regression models.
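
For readers unfamiliar with the slice sampler mentioned above, the following is a minimal univariate version using stepping-out and shrinkage, applied to a bimodal toy target. The target density, the interval width, and the names `log_bimodal` and `slice_sample` are assumptions for illustration. The sketch does not reproduce the paper's neural network posterior or its combination with Metropolis sampling.

```python
import math
import random


def log_bimodal(x):
    """Log density, up to a constant, of an equal mix of N(-2, 1) and N(2, 1)."""
    a = -0.5 * (x + 2.0) ** 2
    b = -0.5 * (x - 2.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def slice_sample(log_density, x0, n_samples, width=2.0, seed=0):
    """Univariate slice sampling with stepping-out and shrinkage."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Draw the slice level uniformly under the density at x (log scale).
        log_y = log_density(x) - rng.expovariate(1.0)
        # Stepping out: place an interval of the given width at random around
        # x, then expand each end until it lies outside the slice.
        left = x - width * rng.random()
        right = left + width
        while log_density(left) > log_y:
            left -= width
        while log_density(right) > log_y:
            right += width
        # Shrinkage: sample uniformly from the interval and shrink it toward
        # x after every rejected point.
        while True:
            proposal = rng.uniform(left, right)
            if log_density(proposal) > log_y:
                x = proposal
                break
            if proposal < x:
                left = proposal
            else:
                right = proposal
        samples.append(x)
    return samples


samples = slice_sample(log_bimodal, x0=2.0, n_samples=20000)
sample_mean = sum(samples) / len(samples)
```

Because the interval adapts to the local width of the density, the sampler needs no tuned proposal scale. On this target a low slice level spans both modes, which lets the chain move between them.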

Inverse Estimation of Fatigue Life Parameters for Spring Design Optimization

  • 김완범;안다운;최주호
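    • Computational Structural Engineering Institute of Korea: Conference Proceedings
    • /
    • Computational Structural Engineering Institute of Korea 2011 Annual Conference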
    • /
    • pp.345-348
    • /
    • 2011
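  • Finite element analysis is a very effective method in the design of structural components because it reduces the time and cost of testing. However, since the input material properties vary with the manufacturing process and the environment, the results of finite element analysis should not be trusted unconditionally. Verifying the reliability of finite element analysis is therefore very important. In this study, fatigue life parameters are inversely estimated using finite element analysis together with fatigue life test data accumulated in the field. A Bayesian approach is used to obtain the posterior distribution of the uncertain fatigue life parameters, and the Markov chain Monte Carlo (MCMC) technique is used to generate samples of the inversely estimated parameters. Based on these samples, the fatigue life of a spring with a new geometry is predicted. Reliability-based design optimization (RBDO) of the spring shape is performed to satisfy the required life of a suspension coil spring. In addition, a Kriging surrogate model is used to reduce the computational cost of the finite element analysis.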


Modelling Heterogeneity in Fertility for Analysis of Variety Trials

  • 윤성철;강위창;이영조;임용빈
    • The Korean Journal of Applied Statistics (응용통계연구)
    • /
    • Vol. 11, No. 2
    • /
    • pp.423-433
    • /
    • 1998
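  • In agricultural experiments, variety trial data are often analyzed with a randomized complete block design model. This model assumes that fertility is the same for all experimental units within each block. In many cases, however, the fertility of the experimental units within a block shows systematic heterogeneity. To account for this heterogeneity, this paper uses hierarchical generalized linear models to model the variety effects and the within-block fertility effects jointly, applies the model to data listed by the Scottish Agricultural Colleges, and compares the results with those obtained by the Markov chain Monte Carlo method.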
