• Title/Summary/Keyword: Dirichlet Process

Identifying differentially expressed genes using the Polya urn scheme

  • Saraiva, Erlandson Ferreira;Suzuki, Adriano Kamimura;Milan, Luis Aparecido
    • Communications for Statistical Applications and Methods / v.24 no.6 / pp.627-640 / 2017
  • A common goal in gene expression data analysis is to identify genes whose expression levels change significantly across biological experimental conditions. In this paper, we develop a Bayesian approach for gene-by-gene comparison when there is a control and more than one treatment condition. The approach uses a Dirichlet process prior. The comparison is based on a model selection procedure that exploits the discreteness of the Dirichlet process and its representation via the Polya urn scheme. Posterior probabilities for the candidate models are calculated using a Gibbs sampling algorithm. A simulation study compares the performance of the proposed method with the usual approach of analysis of variance (ANOVA) followed by a Tukey test, in terms of true positive rate and false discovery rate. We find that the proposed method outperforms ANOVA followed by a Tukey test. We also apply the methodologies to a publicly available data set on Plasmodium falciparum protein.
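
The Polya urn representation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' model selection procedure; it only shows why draws from a Dirichlet process produce ties, which is the property such a comparison can exploit. The function name `polya_urn_draws`, the concentration value, and the standard normal base measure are all assumptions chosen for illustration.

```python
import random

def polya_urn_draws(n, alpha, base_sampler, rng):
    """Draw n values from the Polya urn representation of a Dirichlet process.

    alpha is the concentration parameter; base_sampler(rng) draws from the
    base measure. Both are illustrative choices, not taken from the paper.
    """
    draws = []
    for i in range(n):
        # With probability alpha / (alpha + i) take a fresh value from the
        # base measure; otherwise copy one of the i earlier draws uniformly.
        if rng.random() < alpha / (alpha + i):
            draws.append(base_sampler(rng))
        else:
            draws.append(rng.choice(draws))
    return draws

rng = random.Random(0)
draws = polya_urn_draws(50, alpha=1.0,
                        base_sampler=lambda r: r.gauss(0.0, 1.0), rng=rng)
n_distinct = len(set(draws))  # number of distinct values among the 50 draws
```

Because earlier values are reused, several draws share exactly the same value. In a gene-by-gene comparison, two conditions sharing a value would correspond to "no difference" between them, under the assumptions of this sketch.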

A nonparametric Bayesian seemingly unrelated regression model (비모수 베이지안 겉보기 무관 회귀모형)

  • Jo, Seongil;Seok, Inhae;Choi, Taeryon
    • The Korean Journal of Applied Statistics / v.29 no.4 / pp.627-641 / 2016
  • In this paper, we consider a seemingly unrelated regression (SUR) model and propose a nonparametric Bayesian approach to SUR with a Dirichlet process mixture of normals for modeling an unknown error distribution. Posterior distributions are derived based on the proposed model, and the posterior inference is performed via Markov chain Monte Carlo methods based on the collapsed Gibbs sampler of a Dirichlet process mixture model. We present a simulation study to assess the performance of the model. We also apply the model to precipitation data over South Korea.
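
As a rough illustration of what a Dirichlet process mixture of normals looks like as an error distribution, the sketch below simulates errors from a truncated stick-breaking approximation. It simulates from the prior only and is not the collapsed Gibbs sampler used in the paper. The truncation level `K`, the concentration `alpha`, and the component means and standard deviations are hypothetical values.

```python
import random

def stick_breaking_weights(alpha, k, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha."""
    weights, remaining = [], 1.0
    for _ in range(k - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)            # leftover mass, so weights sum to 1
    return weights

def sample_errors(n, weights, means, sds, rng):
    """Draw n error terms from the finite normal mixture given by weights."""
    components = rng.choices(range(len(weights)), weights=weights, k=n)
    return [rng.gauss(means[c], sds[c]) for c in components]

rng = random.Random(1)
K = 20                                   # truncation level (assumption)
weights = stick_breaking_weights(alpha=2.0, k=K, rng=rng)
means = [rng.gauss(0.0, 3.0) for _ in range(K)]  # hypothetical component means
sds = [1.0] * K                                   # hypothetical component sds
errors = sample_errors(200, weights, means, sds, rng)
```

Most of the mass typically falls on a few components, so the simulated errors can be multimodal or skewed, which a single normal error term cannot capture.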

Nonparametric Bayesian Estimation for the Exponential Lifetime Data under the Type II Censoring

  • Lee, Woo-Dong;Kim, Dal-Ho;Kang, Sang-Gil
    • Communications for Statistical Applications and Methods / v.8 no.2 / pp.417-426 / 2001
  • This paper addresses nonparametric Bayesian estimation for exponential populations under type II censoring. A Dirichlet process prior is used to provide nonparametric Bayesian estimates of the parameters of the exponential populations. Nonparametric Bayesian problems have historically posed computational difficulties; this paper handles them with a Gibbs sampler algorithm. The procedure is applied to a real example and compared with a classical estimator.

Semiparametric Bayesian multiple comparisons for Poisson Populations

  • Cho, Jang Sik;Kim, Dal Ho;Kang, Sang Gil
    • Communications for Statistical Applications and Methods / v.8 no.2 / pp.427-434 / 2001
  • In this paper, we consider a nonparametric Bayesian approach to the multiple comparisons problem for I Poisson populations using Dirichlet process priors. We describe a Gibbs sampling algorithm, a Markov chain Monte Carlo method, for calculating posterior probabilities of the hypotheses. We also provide a numerical example to illustrate the technique.

An Application of Dirichlet Mixture Model for Failure Time Density Estimation to Components of Naval Combat System (디리슈레 혼합모형을 이용한 함정 전투체계 부품의 고장시간 분포 추정)

  • Lee, Jinwhan;Kim, Jung Hun;Jung, BongJoo;Kim, Kyeongtaek
    • Journal of Korean Society of Industrial and Systems Engineering / v.42 no.4 / pp.194-202 / 2019
  • Reliability analysis of components frequently starts with data that the manufacturer provides. If enough failure data are collected from field operations, reliability should be recomputed and updated on the basis of the field failure data. However, when the failure time record for a component contains only a few observations, all statistical methodologies are limited. In this case, when failure records for multiple identical components are available, a valid alternative is to combine the data from each component into one data set with a sufficient sample size and to use the information in the censored data. The ROK Navy has been operating multiple Patrol Killer Guided missiles (PKGs) for several years. The Korea Multi-Function Control Console (KMFCC) is one of the key components in the PKG combat system. The maintenance record for the KMFCC contains fewer than ten failure observations and a censored datum. This paper proposes a Bayesian approach with a Dirichlet mixture model to estimate the failure time density for the KMFCC. A trend test on each component record did not reject the null hypothesis that failure occurrence is a renewal process. Since the KMFCCs have been functioning under different operating environments, the failure time distribution may be a mixture of a number of unknown distributions rather than a single distribution. The Dirichlet mixture model was coded as a probabilistic program in Python using PyMC3. The Markov chain Monte Carlo (MCMC) sampling in PyMC3 then estimated the posterior distribution of the parameters under the Dirichlet mixture model. The simulation results showed that the mixture models fit the combined data set better than single-distribution models.

EMPIRICAL BAYES ESTIMATION OF RESIDUAL SURVIVAL FUNCTION AT AGE

  • Liang, Ta-Chen
    • Journal of the Korean Statistical Society / v.33 no.2 / pp.191-202 / 2004
  • The paper considers nonparametric empirical Bayes estimation of the residual survival function at age t using a Dirichlet process prior $\mathcal{D}(\alpha)$. Empirical Bayes estimators are proposed for the case where both the function $\alpha(0, x]$ and the size $\alpha(R^+)$ are unknown. It is shown that the proposed empirical Bayes estimators are asymptotically optimal at a rate $n^{-1}$, where n is the number of past data available for the present estimation problem. Therefore, the result of Lahiri and Park (1988), in which $\alpha(R^+)$ is assumed to be known and a rate $n^{-1}$ is achieved, is extended to the case where $\alpha(R^+)$ is unknown.

A Semantic Aspect-Based Vector Space Model to Identify the Event Evolution Relationship within Topics

  • Xi, Yaoyi;Li, Bicheng;Liu, Yang
    • Journal of Computing Science and Engineering / v.9 no.2 / pp.73-82 / 2015
  • Understanding how a topic evolves is an important and challenging task. A topic usually consists of multiple related events, and accurate identification of event evolution relationships plays an important role in topic evolution analysis. Existing research has used the traditional vector space model to represent events, which cannot accurately compute the semantic similarity between events. This has led to poor performance in identifying event evolution relationships. This paper suggests constructing a semantic aspect-based vector space model to represent events: first, use a hierarchical Dirichlet process to mine the semantic aspects; then, construct a semantic aspect-based vector space model from these aspects; finally, represent each event as a point and measure the semantic relatedness between events in that space. According to our evaluation experiments, the proposed technique is promising and significantly outperforms the baseline methods.

Bayesian Multiple Comparisons for Normal Variances

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society / v.29 no.2 / pp.155-168 / 2000
  • For the multiple comparison problem (MCP) of k normal population variances, we suggest a Bayesian method for calculating posterior probabilities of various hypotheses of equality among the population variances. This leads to a simple method for obtaining pairwise comparisons of variances in a statistical experiment, with a partition of the parameter space induced by equality and inequality relationships among the variances. The method relies on the fact that certain features of the hierarchical nonparametric family of Dirichlet process priors make it amenable to solving the MCP and to estimating the posterior probabilities by posterior simulation, namely Gibbs sampling. The method is illustrated with two examples. In these examples, the method is straightforward to specify distributionally and to implement computationally, with output readily adapted to the required comparisons.
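
One way to read "estimating the posterior probabilities by posterior simulation" is as counting how often each equality pattern appears among sampled group labels. The sketch below assumes the sampler output is a list of label tuples, one label per population, where equal labels mean equal parameters. The helper names and the four sample tuples are hypothetical and are not output from the paper.

```python
from collections import Counter

def partition_of(labels):
    """Relabel groups in order of first appearance, e.g. (5, 5, 9) -> (0, 0, 1)."""
    mapping = {}
    return tuple(mapping.setdefault(label, len(mapping)) for label in labels)

def hypothesis_probabilities(label_samples):
    """Estimate the posterior probability of each equality pattern by its frequency."""
    counts = Counter(partition_of(s) for s in label_samples)
    total = sum(counts.values())
    return {pattern: c / total for pattern, c in counts.items()}

# Hypothetical sampler output for k = 3 populations (made-up labels).
samples = [(5, 5, 9), (2, 2, 7), (1, 3, 3), (4, 4, 4)]
probs = hypothesis_probabilities(samples)
# probs[(0, 0, 1)] is the estimated probability that populations 1 and 2
# share a parameter while population 3 differs.
```

Canonical relabeling matters because the label values themselves carry no meaning; only which populations share a label does.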

A Comparison Study of Bayesian Methods for a Threshold Autoregressive Model with Regime-Switching (국면전환 임계 자기회귀 분석을 위한 베이지안 방법 비교연구)

  • Roh, Taeyoung;Jo, Seongil;Lee, Ryounghwa
    • The Korean Journal of Applied Statistics / v.27 no.6 / pp.1049-1068 / 2014
  • Autoregressive models are used to analyze univariate time series data; however, they can be inappropriate when a structural break appears in the series, since they assume that the trend is consistent. Threshold autoregressive models (popular regime-switching models) have been proposed to address this problem. Recently, these models have been extended to two-regime switching models with a delay parameter. We discuss two-regime threshold autoregressive models from a Bayesian point of view. For the Bayesian analysis, we consider a parametric threshold autoregressive model and a nonparametric threshold autoregressive model using a Dirichlet process prior. The posterior distributions are derived, and posterior inference for both models is performed via Markov chain Monte Carlo methods. We present a simulation study to compare the performance of the models. We also apply the models to gross domestic product data of the U.S.A. and South Korea.

Comparison and Analysis of Subject Classification for Domestic Research Data (국내 학술논문 주제 분류 알고리즘 비교 및 분석)

  • Choi, Wonjun;Sul, Jaewook;Jeong, Heeseok;Yoon, Hwamook
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.178-186 / 2018
  • Article-level subject classification is essential for delivering scholarly information services. To date, however, topic classification has mostly been journal-based, and few services classify subjects at the article level. For domestic academic papers, subject classification can be especially valuable because it can cover a larger service area and allows services to be scoped by range. However, classifying topics by field requires experts from various fields, and various verification methods are needed to increase accuracy. In this paper, we classify topics using unsupervised learning algorithms, which look for structure when the correct answer is unknown, and compare the results of the subject classification algorithms using coherence and perplexity. The unsupervised learning algorithms compared are the well-known Hierarchical Dirichlet Process (HDP), Latent Dirichlet Allocation (LDA), and Latent Semantic Indexing (LSI).
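
Perplexity, one of the two comparison measures named in the abstract above, is the exponential of the negative average log-likelihood per token. The sketch below computes it from per-token log-probabilities. The function name and the toy input are assumptions, and how the paper obtained its token probabilities is not stated here.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(-(1/N) * sum of natural-log token probabilities)."""
    n = len(token_log_probs)
    if n == 0:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_log_probs) / n)

# Toy check: a model that assigns probability 1/4 to every one of 10 tokens
# is as uncertain as a uniform choice among 4 words.
uniform_log_probs = [math.log(1 / 4)] * 10
pp = perplexity(uniform_log_probs)
```

Lower perplexity means the model assigns higher probability to held-out text. It does not by itself guarantee topics that are interpretable to people, which is one reason to report a coherence measure alongside it.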