• Title/Summary/Keyword: Markov chain Monte Carlo


Seasonal rainfall short-term forecasting model considering climate indices (외부기상인자를 고려한 낙동강유역 계절강수량 단기예측모형)

  • Lee, Jeong-Ju;Kwon, Hyun-Han;Hwang, Kyu-Nam;Chun, Si-Young
    • Proceedings of the Korea Water Resources Association Conference / 2011.05a / pp.401-401 / 2011
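  • This study aims to forecast seasonal precipitation by coupling external climate predictors with a nonstationary frequency analysis model based on Bayesian MCMC (Markov chain Monte Carlo). The forecast target is summer precipitation, which is particularly useful in assessing flood risk. To account for variability driven by external climate factors within a nonstationary frequency analysis model, the target hydrologic variable must be restricted, and climate factors closely associated with extreme precipitation, such as the Changma (monsoon) front and typhoons, are ill-suited as predictors because of their spatial variability and complex characteristics. This study therefore takes summer precipitation as the seasonal hydrologic variable and introduces SST (sea surface temperature) and OLR (outgoing longwave radiation) as the external climate factors that influence it. A forecasting model for July to September summer precipitation was built using, as predictors, the preceding winter SST and the June OLR from regions showing high spatial correlation with summer precipitation in the Nakdong River basin. The model was validated against summer 2010 precipitation, for which the outcome is known, and was then applied to forecast summer 2011 precipitation using the winter 2010 SST observed to date and a June 2011 OLR assumed from past observations. As a result, forecasts including uncertainty intervals were obtained from the posterior distributions of the model parameters.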


Scaling Documents' Semantic Transparency Spectrum with Semantic Hypernetwork (Semantic Hypernetwork 학습에 의한 자연언어 텍스트의 의미 구분)

  • Lee, Eun-Seok;Kim, Joon-Shik;Shin, Won-Jin;Park, Chan-Hoon;Zhang, Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference / 2008.06c / pp.289-294 / 2008
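  • The meaning a natural-language document is intended to convey can be very clear (e.g., news articles) or very obscure (e.g., poetry), depending on the nature of the text. This study assumes that this 'semantic transparency' can be measured quantitatively and discusses how the probabilistic and statistical properties of word associations function in judging it. To this end, focusing on the neighboring frequency and degeneracy that arise as a given word forms association chains, we applied a Markov chain Monte Carlo scheme to train a semantic network (the 'Semantic Hypernetwork') and then examined the connectivity among a document's constituent words and sets of those words. We analyzed how the model classifies semantic transparency for documents whose semantic representations are clearly distinct (news and poetry). The two properties, neighboring frequency and degeneracy, can be proposed as significant substrates for semantic-network memory and for learning and search mechanisms in language structure. The main results of this study are 1) statistical evidence that distinguishes the semantic transparency of texts, 2) the discovery of a new substrate for the semantic structure of documents, and 3) a proposed classification scheme that differs from conventional category-based document classification.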


The inference and estimation for latent discrete outcomes with a small sample

  • Choi, Hyung;Chung, Hwan
    • Communications for Statistical Applications and Methods / v.23 no.2 / pp.131-146 / 2016
  • In behavioral research, significant attention has been paid to the stage-sequential process for longitudinal data. Latent class profile analysis (LCPA) is a useful method for studying sequential patterns of behavioral development through a two-step identification process: identifying a small number of latent classes at each measurement occasion, and then two or more homogeneous subgroups in which individuals exhibit a similar sequence of latent class membership over time. Maximum likelihood (ML) estimates for LCPA are easily obtained by the expectation-maximization (EM) algorithm, and Bayesian inference can be implemented via Markov chain Monte Carlo (MCMC). However, unusual properties of the LCPA likelihood can cause difficulties in ML and Bayesian inference, as well as in estimation, with small samples. This article describes and addresses the erratic behavior of conventional ML and Bayesian estimates for LCPA with small samples. We argue that these problems can be alleviated with a small amount of prior input. This study evaluates the performance of likelihood-based and MCMC-based estimates with the proposed prior in drawing inference over repeated sampling. Our simulation shows that estimates from the proposed methods perform better than those from the conventional ML and Bayesian methods.

Bayesian Variable Selection in Linear Regression Models with Inequality Constraints on the Coefficients (제한조건이 있는 선형회귀 모형에서의 베이지안 변수선택)

  • 오만숙
    • The Korean Journal of Applied Statistics / v.15 no.1 / pp.73-84 / 2002
  • Linear regression models with inequality constraints on the coefficients are frequently used in economics because of sign or order constraints on the coefficients. In this paper, we propose a Bayesian approach to selecting significant explanatory variables in linear regression models with inequality constraints on the coefficients. Bayesian variable selection requires computing the posterior probability of each candidate model. We propose a method that computes all the necessary posterior model probabilities simultaneously. Specifically, we obtain posterior samples from the most general model via the Gibbs sampling algorithm (Gelfand and Smith, 1990) and compute the posterior probabilities using those samples. A real example is given to illustrate the method.

Bayesian Variable Selection in the Proportional Hazard Model with Application to Microarray Data

  • Lee, Kyeong-Eun;Mallick, Bani K.
    • Proceedings of the Korean Statistical Society Conference / 2005.05a / pp.17-23 / 2005
  • In this paper we consider the well-known semiparametric proportional hazards models for survival analysis. These models are usually used with few covariates and many observations (subjects). For a typical setting of gene expression data from DNA microarrays, however, we need to consider the case where the number of covariates p exceeds the number of samples n. Given a vector of response values, which are times to event (death or censoring times), and p gene expressions (covariates), we address the issue of how to reduce the dimension by selecting the significant genes. This approach enables us to estimate the survival curve when $n \ll p$. In our approach, rather than fixing the number of selected genes, we assign a prior distribution to this number. The approach creates additional flexibility by allowing the imposition of constraints, such as bounding the dimension via a prior, which in effect works as a penalty. To implement our methodology, we use a Markov chain Monte Carlo (MCMC) method. We demonstrate the methodology on diffuse large B-cell lymphoma (DLBCL) complementary DNA (cDNA) data and breast carcinoma data.


A Bayesian cure rate model with dispersion induced by discrete frailty

  • Cancho, Vicente G.;Zavaleta, Katherine E.C.;Macera, Marcia A.C.;Suzuki, Adriano K.;Louzada, Francisco
    • Communications for Statistical Applications and Methods / v.25 no.5 / pp.471-488 / 2018
  • In this paper, we propose extending proportional hazards frailty models to allow a discrete distribution for the frailty variable. Having zero frailty can be interpreted as being immune or cured. Thus, we develop a new survival model induced by discrete frailty with a zero-inflated power series distribution, which can account for overdispersion. This proposal also allows for a realistic description of non-risk individuals, since individuals cured due to intrinsic factors (immunes) are modeled by a deterministic fraction of zero-risk while those cured due to an intervention are modeled by a random fraction. We put the proposed model in a Bayesian framework and use a Markov chain Monte Carlo algorithm to compute the posterior distribution. A simulation study is conducted to assess the proposed model and the computation algorithm. We also discuss model selection based on pseudo-Bayes factors and develop case influence diagnostics for the joint posterior distribution through $\psi$-divergence measures. The motivating cutaneous melanoma data are analyzed for illustration.

Cure rate proportional odds models with spatial frailties for interval-censored data

  • Yiqi, Bao;Cancho, Vicente Garibay;Louzada, Francisco;Suzuki, Adriano Kamimura
    • Communications for Statistical Applications and Methods / v.24 no.6 / pp.605-625 / 2017
  • This paper presents proportional odds cure models that allow for spatial correlation by including spatial frailties in the interval-censored data setting. Parametric cure rate models with independent and dependent spatial frailties are proposed and compared. Our approach accommodates different underlying activation mechanisms that lead to the event of interest; in addition, the number of competing causes that may be responsible for the occurrence of the event of interest follows a geometric distribution. A Markov chain Monte Carlo method is used in a Bayesian framework for inference. Bayesian criteria were used for model comparison. An influence diagnostic analysis was conducted to detect possibly influential or extreme observations that may distort the results of the analysis. Finally, the proposed models are applied to a real data set on smoking cessation. The results of the application show that the parametric cure model with frailties under the first activation scheme yields the better results.
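
    A worked derivation may help show why a geometric number of competing causes leads to a proportional odds structure. The sketch below is a standard textbook calculation, not the authors' full model: it assumes the first activation scheme (the event occurs at the earliest of the latent cause times), i.i.d. latent times with distribution function $F(t)$, a geometric count $N$ parameterized by its mean $\theta$, and no spatial frailty. The symbols $N$, $\theta$, $F$, and $S_{\mathrm{pop}}$ are illustrative names that do not appear in the abstract.

    ```latex
    % Assumed parameterization: N geometric on {0, 1, 2, ...} with mean \theta
    P(N = n) = \frac{1}{1+\theta}\left(\frac{\theta}{1+\theta}\right)^{n},
    \qquad n = 0, 1, 2, \dots

    % First activation: T = \min(Z_1, \dots, Z_N), with T = \infty when N = 0
    S_{\mathrm{pop}}(t) = E\!\left[\{1 - F(t)\}^{N}\right]
                        = \frac{1}{1 + \theta F(t)}

    % Cure fraction and the odds of having experienced the event by time t
    \lim_{t \to \infty} S_{\mathrm{pop}}(t) = \frac{1}{1+\theta},
    \qquad
    \frac{1 - S_{\mathrm{pop}}(t)}{S_{\mathrm{pop}}(t)} = \theta\, F(t)
    ```

    Under these assumptions the odds of the event by time $t$ are proportional to $\theta$, so covariates or frailties that enter through $\theta$ act multiplicatively on the odds.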

Bayesian analysis of directional conditionally autoregressive models (방향성 공간적 조건부 자기회귀 모형의 베이즈 분석 방법)

  • Kyung, Minjung
    • Journal of the Korean Data and Information Science Society / v.27 no.5 / pp.1133-1146 / 2016
  • Counts or averages over arbitrary regions are often analyzed using conditionally autoregressive (CAR) models. The spatial neighborhoods within a CAR model are generally formed using only the distances or boundaries between the sub-regions. Kyung and Ghosh (2009) proposed a new class of models to accommodate spatial variations that may depend on direction, giving different weights to neighbors in different directions. The proposed model, the directional conditionally autoregressive (DCAR) model, generalizes the usual CAR model by accounting for spatial anisotropy. A Bayesian inference method is discussed based on efficient Markov chain Monte Carlo (MCMC) sampling of the posterior distributions of the parameters. The method is illustrated using a data set of median property prices across Greater Glasgow, Scotland, in 2008.
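
    For orientation, one common textbook form of a proper CAR prior is written out below. It is a generic sketch, not the DCAR specification of Kyung and Ghosh (2009): the symbols $\phi_i$ (spatial effect of region $i$), $w_{ij}$ (neighbor weight), $\rho$ (spatial dependence), and $\tau^2$ (conditional variance scale) are assumed names. The abstract describes the directional model only as assigning different weights to neighbors in different directions; the exact form of those weights is not given here.

    ```latex
    % Full conditional of one region's effect given all the others
    \phi_i \mid \phi_{-i} \sim
      N\!\left( \rho \sum_{j \neq i} \frac{w_{ij}}{w_{i+}}\, \phi_j ,\;
                \frac{\tau^{2}}{w_{i+}} \right),
    \qquad
    w_{i+} = \sum_{j \neq i} w_{ij}
    ```

    In the usual isotropic case $w_{ij}$ is 1 for adjacent regions and 0 otherwise, so every neighbor contributes equally regardless of direction.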

A nonparametric Bayesian seemingly unrelated regression model (비모수 베이지안 겉보기 무관 회귀모형)

  • Jo, Seongil;Seok, Inhae;Choi, Taeryon
    • The Korean Journal of Applied Statistics / v.29 no.4 / pp.627-641 / 2016
  • In this paper, we consider a seemingly unrelated regression (SUR) model and propose a nonparametric Bayesian approach to SUR with a Dirichlet process mixture of normals for modeling an unknown error distribution. Posterior distributions are derived based on the proposed model, and the posterior inference is performed via Markov chain Monte Carlo methods based on the collapsed Gibbs sampler of a Dirichlet process mixture model. We present a simulation study to assess the performance of the model. We also apply the model to precipitation data over South Korea.

Use of Lévy distribution to analyze longitudinal data with asymmetric distribution and presence of left censored data

  • Achcar, Jorge A.;Coelho-Barros, Emilio A.;Cuevas, Jose Rafael Tovar;Mazucheli, Josmar
    • Communications for Statistical Applications and Methods / v.25 no.1 / pp.43-60 / 2018
  • This paper considers the use of classical and Bayesian inference methods to analyze data generated by variables whose natural behavior can be modeled using asymmetric distributions in the presence of left censoring. Our approach uses a Lévy distribution in the presence of left-censored data and covariates. This distribution can be a good alternative for modeling asymmetric data in many applications, such as lifetime data, especially in engineering and health research, when some observations are large in comparison with the others and the standard distributions commonly used for asymmetric data, such as the exponential, Weibull, or log-logistic, do not fit the data well. Under the classical approach, inferences for the parameters of the proposed model are obtained using maximum likelihood estimators (MLEs) and the usual asymptotic normality of MLEs based on the Fisher information. Under the Bayesian approach, the posterior summaries of interest are obtained using standard Markov chain Monte Carlo simulation methods and available software such as SAS. A numerical illustration is presented using data on thyroglobulin levels in a group of individuals with differentiated thyroid cancer.
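
    To make the role of left censoring concrete, here is a minimal Python sketch of a Lévy log-likelihood in which each left-censored observation contributes the distribution function at its detection limit instead of the density. It is an illustration under stated assumptions, not the paper's model: it uses the standard two-parameter Lévy distribution with location `mu` and scale `c`, omits the covariates mentioned in the abstract, and all function and parameter names are hypothetical.

    ```python
    import math

    def levy_logpdf(x, mu, c):
        """Log-density of the Levy distribution (location mu, scale c), for x > mu."""
        z = x - mu
        return 0.5 * math.log(c / (2.0 * math.pi)) - 1.5 * math.log(z) - c / (2.0 * z)

    def levy_cdf(x, mu, c):
        """Distribution function of the Levy distribution, for x > mu."""
        return math.erfc(math.sqrt(c / (2.0 * (x - mu))))

    def censored_loglik(values, censored, mu, c):
        """Log-likelihood with left censoring (assumed setup, no covariates).

        values[i] is the observed value, or the detection limit when
        censored[i] is True, in which case only "true value <= limit" is known.
        """
        total = 0.0
        for v, is_censored in zip(values, censored):
            if is_censored:
                total += math.log(levy_cdf(v, mu, c))  # P(X <= limit)
            else:
                total += levy_logpdf(v, mu, c)         # exact observation
        return total
    ```

    A function like this could be passed to a numerical optimizer for maximum likelihood, or combined with log-priors inside an MCMC sampler; either use would still need the covariate structure described in the paper.
    
    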