• Title/Abstract/Keywords: Dirichlet Mixture Model

Curve Clustering in Microarray

  • Lee, Kyeong-Eun
    • Journal of the Korean Data and Information Science Society / v.15 no.3 / pp.575-584 / 2004
  • We propose a Bayesian model-based approach using a mixture of Dirichlet processes model with discrete wavelet transform, for curve clustering in the microarray data with time-course gene expressions.

Dirichlet Process Mixtures of Linear Mixed Regressions

  • Kyung, Minjung
    • Communications for Statistical Applications and Methods / v.22 no.6 / pp.625-637 / 2015
  • We develop a Bayesian clustering procedure based on a Dirichlet process prior with cluster specific random effects. Gibbs sampling of a normal mixture of linear mixed regressions with a Dirichlet process was implemented to calculate posterior probabilities when the number of clusters was unknown. Our approach (unlike its counterparts) provides simultaneous partitioning and parameter estimation with the computation of the classification probabilities. A Monte Carlo study of curve estimation results showed that the model was useful for function estimation. We find that the proposed Dirichlet process mixture model with cluster specific random effects detects clusters sensitively by combining vague edges into different clusters. Examples are given to show how these models perform on real data.

An Application of Dirichlet Mixture Model for Failure Time Density Estimation to Components of Naval Combat System (디리슈레 혼합모형을 이용한 함정 전투체계 부품의 고장시간 분포 추정)

  • Lee, Jinwhan;Kim, Jung Hun;Jung, BongJoo;Kim, Kyeongtaek
    • Journal of Korean Society of Industrial and Systems Engineering / v.42 no.4 / pp.194-202 / 2019
  • Reliability analysis of components frequently starts with the data that the manufacturer provides. If enough failure data are collected from field operations, the reliability should be recomputed and updated on the basis of the field failure data. However, when the failure time record for a component contains only a few observations, all statistical methodologies are limited. In this case, where failure records for multiple identical components are available, a valid alternative is to combine the data from each component into one data set with a sufficient sample size and to utilize the useful information in the censored data. The ROK Navy has been operating multiple Patrol Killer Guided missiles (PKGs) for several years. The Korea Multi-Function Control Console (KMFCC) is one of the key components in the PKG combat system. The maintenance record for the KMFCC contains fewer than ten failure observations and a censored datum. This paper proposes a Bayesian approach with a Dirichlet mixture model to estimate the failure time density for the KMFCC. A trend test for each component record indicated that the null hypothesis, that failure occurrence is a renewal process, is not rejected. Since the KMFCCs have been functioning under different operating environments, the failure time distribution may be a composition of a number of unknown distributions, i.e. a mixture distribution, rather than a single distribution. The Dirichlet mixture model was coded as a probabilistic program in Python using PyMC3. The Markov chain Monte Carlo (MCMC) sampling technique employed in PyMC3 then estimated the parameters' posterior distribution through the Dirichlet mixture model. The simulation results revealed that the mixture models provide superior fits to the combined data set over single models.

Bayesian Curve Clustering in Microarray

  • Lee, Kyeong-Eun;Mallick, Bani K.
    • Korean Data and Information Science Society: Conference Proceedings (한국데이터정보과학회:학술대회논문집) / 2006.04a / pp.39-42 / 2006
  • We propose a Bayesian model-based approach using a mixture of Dirichlet processes model with discrete wavelet transform, for curve clustering in the microarray data with time-course gene expressions.

Language Model Adaptation Based on Topic Probability of Latent Dirichlet Allocation

  • Jeon, Hyung-Bae;Lee, Soo-Young
    • ETRI Journal / v.38 no.3 / pp.487-493 / 2016
  • Two new methods are proposed for an unsupervised adaptation of a language model (LM) with a single sentence for automatic transcription tasks. At the training phase, training documents are clustered by a method known as Latent Dirichlet allocation (LDA), and then a domain-specific LM is trained for each cluster. At the test phase, an adapted LM is presented as a linear mixture of the now trained domain-specific LMs. Unlike previous adaptation methods, the proposed methods fully utilize a trained LDA model for the estimation of weight values, which are then to be assigned to the now trained domain-specific LMs; therefore, the clustering and weight-estimation algorithms of the trained LDA model are reliable. For the continuous speech recognition benchmark tests, the proposed methods outperform other unsupervised LM adaptation methods based on latent semantic analysis, non-negative matrix factorization, and LDA with n-gram counting.
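
The linear mixture of domain-specific LMs described in this abstract can be illustrated with a minimal Python sketch. It assumes toy unigram LMs stored as dictionaries and takes the mixture weights as given; the function name `mix_language_models` and all numbers are hypothetical, and the paper's LDA-based weight estimation is not reproduced here.

```python
def mix_language_models(domain_lms, weights):
    """Return P(w) = sum_k weights[k] * P_k(w) over the union vocabulary.

    domain_lms: list of dicts mapping word -> probability (toy unigram LMs).
    weights: one non-negative weight per LM; normalized here to sum to 1.
    """
    if len(domain_lms) != len(weights):
        raise ValueError("need exactly one weight per domain LM")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    norm = [w / total for w in weights]
    vocab = set().union(*domain_lms)  # union of all words seen in any LM
    return {
        word: sum(w * lm.get(word, 0.0) for w, lm in zip(norm, domain_lms))
        for word in vocab
    }


# Hypothetical toy LMs and weights, for illustration only.
sports_lm = {"goal": 0.6, "vote": 0.1, "team": 0.3}
politics_lm = {"goal": 0.1, "vote": 0.7, "team": 0.2}
adapted = mix_language_models([sports_lm, politics_lm], [0.25, 0.75])
```

Because each component LM sums to one and the weights are normalized, the adapted distribution also sums to one.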

A nonparametric Bayesian seemingly unrelated regression model (비모수 베이지안 겉보기 무관 회귀모형)

  • Jo, Seongil;Seok, Inhae;Choi, Taeryon
    • The Korean Journal of Applied Statistics / v.29 no.4 / pp.627-641 / 2016
  • In this paper, we consider a seemingly unrelated regression (SUR) model and propose a nonparametric Bayesian approach to SUR with a Dirichlet process mixture of normals for modeling an unknown error distribution. Posterior distributions are derived based on the proposed model, and the posterior inference is performed via Markov chain Monte Carlo methods based on the collapsed Gibbs sampler of a Dirichlet process mixture model. We present a simulation study to assess the performance of the model. We also apply the model to precipitation data over South Korea.
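
As a rough illustration of one ingredient of a collapsed Gibbs sampler for a Dirichlet process mixture, the sketch below computes the Chinese-restaurant-process prior weights used when an observation is reassigned. In an actual sampler these weights would be multiplied by cluster-specific predictive likelihoods, which are omitted here. The function name `crp_prior_probs` and the `"new"` label are hypothetical, and nothing below is taken from the paper itself.

```python
from collections import Counter


def crp_prior_probs(assignments, alpha):
    """Prior probability that the next observation joins each cluster.

    assignments: cluster labels (assumed to be ints) of the other observations.
    alpha: Dirichlet process concentration parameter (> 0).
    An existing cluster with n_k members gets n_k / (n + alpha);
    a brand-new cluster gets alpha / (n + alpha).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n = len(assignments)
    counts = Counter(assignments)
    probs = {label: count / (n + alpha) for label, count in counts.items()}
    probs["new"] = alpha / (n + alpha)  # hypothetical key for a new cluster
    return probs


# Three observations already assigned: two in cluster 0, one in cluster 1.
probs = crp_prior_probs([0, 0, 1], alpha=1.0)
```

Larger values of `alpha` shift mass toward opening new clusters, which is how the number of clusters can remain unknown a priori.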

Bayesian analysis of random partition models with Laplace distribution

  • Kyung, Minjung
    • Communications for Statistical Applications and Methods / v.24 no.5 / pp.457-480 / 2017
  • We develop a random partition procedure based on a Dirichlet process prior with Laplace distribution. Gibbs sampling of a Laplace mixture of linear mixed regressions with a Dirichlet process is implemented as a random partition model when the number of clusters is unknown. Unlike its counterparts, our approach provides simultaneous partitioning and parameter estimation with the computation of classification probabilities. A full Gibbs-sampling algorithm is developed for an efficient Markov chain Monte Carlo posterior computation. The proposed method is illustrated with simulated data and one real data set on energy efficiency from Tsanas and Xifara (Energy and Buildings, 49, 560-567, 2012).

Nonparametric Bayesian methods: a gentle introduction and overview

  • MacEachern, Steven N.
    • Communications for Statistical Applications and Methods / v.23 no.6 / pp.445-466 / 2016
  • Nonparametric Bayesian methods have seen rapid and sustained growth over the past 25 years. We present a gentle introduction to the methods, motivating the methods through the twin perspectives of consistency and false consistency. We then step through the various constructions of the Dirichlet process, outline a number of the basic properties of this process and move on to the mixture of Dirichlet processes model, including a quick discussion of the computational methods used to fit the model. We touch on the main philosophies for nonparametric Bayesian data analysis and then reanalyze a famous data set. The reanalysis illustrates the concept of admissibility through a novel perturbation of the problem and data, showing the benefit of shrinkage estimation and the much greater benefit of nonparametric Bayesian modelling. We conclude with a too-brief survey of fancier nonparametric Bayesian methods.
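
One standard construction of the Dirichlet process is stick-breaking, and a truncated version can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the truncation level, the standard normal base measure, and the name `stick_breaking_weights` are choices made here, not details from the article.

```python
import random


def stick_breaking_weights(alpha, num_atoms, rng):
    """Draw the first num_atoms stick-breaking weights of a DP(alpha, G0).

    Each break v_k ~ Beta(1, alpha) takes a fraction of the remaining stick.
    Returns (weights, leftover), where leftover is the unassigned mass
    caused by truncating the infinite sequence.
    """
    weights = []
    remaining = 1.0
    for _ in range(num_atoms):
        v = rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights, remaining


rng = random.Random(0)  # fixed seed so the draw is reproducible
weights, leftover = stick_breaking_weights(alpha=2.0, num_atoms=50, rng=rng)
# Atom locations drawn from an assumed base measure G0 = N(0, 1).
atoms = [rng.gauss(0.0, 1.0) for _ in weights]
```

The pairs of `weights` and `atoms` define a discrete random distribution; the discreteness is what makes the Dirichlet process induce clustering when used as a mixing measure.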

Online nonparametric Bayesian analysis of parsimonious Gaussian mixture models and scenes clustering

  • Zhou, Ri-Gui;Wang, Wei
    • ETRI Journal / v.43 no.1 / pp.74-81 / 2021
  • The mixture model is a very powerful and flexible tool in clustering analysis. Based on the Dirichlet process and parsimonious Gaussian distribution, we propose a new nonparametric mixture framework for solving challenging clustering problems. Meanwhile, the inference of the model depends on the efficient online variational Bayesian approach, which enhances the information exchange between the whole and the part to a certain extent and applies to scalable datasets. The experiments on the scene database indicate that the novel clustering framework, when combined with a convolutional neural network for feature extraction, has meaningful advantages over other models.

Nonparametric Bayesian Multiple Change Point Problems

  • Kim, Chansoo;Chung, Younshik
    • Journal of the Korean Statistical Society / v.31 no.1 / pp.1-16 / 2002
  • Since changepoint identification is important in many data analysis problems, we wish to make inference about the locations of one or more changepoints of the sequence. We consider Bayesian nonparametric inference for the multiple changepoint problem using a Bayesian segmentation procedure proposed by Yang and Kuo (2000). A mixture of products of Dirichlet processes is used as a prior distribution. To decide whether there exists a single change or not, our approach depends on the nonparametric Bayesian Schwarz information criterion at each step. We discuss how to choose the precision parameter (total mass parameter) in the nonparametric setting and show that the discreteness of the Dirichlet process prior can have a large effect on the nonparametric Bayesian Schwarz information criterion and can lead to conclusions that are very different from those of a reasonable parametric model. One example is presented to show this effect.