• Title/Summary/Keyword: normal mixture distribution model


Estimating Suitable Probability Distribution Function for Multimodal Traffic Distribution Function

  • Yoo, Sang-Lok; Jeong, Jae-Yong; Yim, Jeong-Bin
    • Journal of the Korean Society of Marine Environment & Safety / v.21 no.3 / pp.253-258 / 2015
  • The purpose of this study is to identify a suitable probability distribution function for complex, multimodal data. The normal distribution is widely used as an assumed probability distribution function; however, complex multimodal data are very hard to fit with the normal distribution alone, and errors can arise when other single distribution functions, including the normal, are used. In this study, we experimented to find a well-fitting probability distribution for a multimodal area, using AIS (Automatic Identification System) observation data gathered in Mokpo port over the year 2013. Judged by the chi-squared statistic, the Gaussian mixture model (GMM) was the best-fitting model, ahead of the other candidate distributions such as the extreme value, generalized extreme value, logistic, and normal distributions. The GMM was thus found to be the appropriate model for the multimodal maritime traffic flow distribution. Probability density functions for collision probability and traffic flow distribution can therefore be calculated much more precisely in the future.
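
As a rough illustration of the comparison this abstract describes, the sketch below fits a single normal distribution and a two-component Gaussian mixture to synthetic bimodal data (standing in for the Mokpo AIS observations, which are not reproduced here) and compares them with a binned chi-squared statistic; the bin count and data are illustrative assumptions.

```python
# Sketch: compare a single normal fit against a Gaussian mixture on bimodal data.
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic bimodal "traffic position" data in place of the AIS observations
x = np.concatenate([rng.normal(-2.0, 0.6, 600), rng.normal(2.5, 1.0, 400)])

# Single normal fit
mu, sigma = stats.norm.fit(x)

# Two-component Gaussian mixture fit via EM
gmm = GaussianMixture(n_components=2, random_state=0).fit(x.reshape(-1, 1))

# Chi-squared goodness-of-fit statistic on binned counts
counts, edges = np.histogram(x, bins=20)
n = x.size

def chi2_stat(cdf):
    expected = n * np.diff(cdf(edges))
    mask = expected > 0
    return np.sum((counts[mask] - expected[mask]) ** 2 / expected[mask])

def norm_cdf(t):
    return stats.norm.cdf(t, mu, sigma)

def gmm_cdf(t):
    return sum(w * stats.norm.cdf(t, m, np.sqrt(v))
               for w, m, v in zip(gmm.weights_, gmm.means_.ravel(),
                                  gmm.covariances_.ravel()))

print("chi2, single normal:", chi2_stat(norm_cdf))
print("chi2, 2-component GMM:", chi2_stat(gmm_cdf))  # expected to be much smaller
```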

Variable Selection in Clustering by Recursive Fit of Normal Distribution-based Salient Mixture Model (정규분포기반 두각 혼합모형의 순환적 적합을 이용한 군집분석에서의 변수선택)

  • Kim, Seung-Gu
    • The Korean Journal of Applied Statistics / v.26 no.5 / pp.821-834 / 2013
  • Law et al. (2004) proposed a normal distribution-based salient mixture model for variable selection in clustering. However, this model has substantial problems, such as the unidentifiability of components and the inaccurate selection of informative variables when a cluster is small. We propose an alternative method to overcome these problems and demonstrate its good performance through experiments on simulated and real data.
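
The salient-mixture machinery itself is not reproduced here; as a loose illustration of the underlying idea that only some variables carry cluster structure, the sketch below screens each variable by comparing the BIC of a one-component versus a two-component normal mixture on its marginal. This is a crude stand-in on synthetic data, not Law et al.'s saliency model or the paper's recursive fit.

```python
# Crude screen for clustering-informative variables: a variable whose marginal
# is better described by a 2-component normal mixture than by a single normal
# is a candidate "salient" variable.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
informative = np.concatenate([rng.normal(0, 1, 300), rng.normal(4, 1, 300)])
noise = rng.normal(0, 1, 600)
X = np.column_stack([informative, noise])

for j, name in enumerate(["informative", "noise"]):
    col = X[:, j].reshape(-1, 1)
    bic1 = GaussianMixture(n_components=1, random_state=0).fit(col).bic(col)
    bic2 = GaussianMixture(n_components=2, random_state=0).fit(col).bic(col)
    # The informative variable should favour k=2; the noise variable k=1.
    print(f"{name}: BIC(k=1)={bic1:.1f}, BIC(k=2)={bic2:.1f}")
```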

Reject Inference of Incomplete Data Using a Normal Mixture Model

  • Song, Ju-Won
    • The Korean Journal of Applied Statistics / v.24 no.2 / pp.425-433 / 2011
  • Reject inference in credit scoring is a statistical approach to adjust for the nonrandom sample bias due to rejected applicants. Function estimation approaches are based on the assumption that rejected applicants need not be included in the estimation when the missing-data mechanism is missing at random. On the other hand, the density estimation approach using mixture models indicates that reject inference should include rejected applicants in the model. When mixture models are chosen for reject inference, it is often assumed that the data follow a normal distribution. If the data include missing values, applying the normal mixture model only to fully observed cases may cause another sample bias due to those missing values. We extend reject inference by a multivariate normal mixture model to handle incomplete characteristic variables. A simulation study shows that including incomplete characteristic variables outperforms the function estimation approaches.
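
A minimal sketch of density-based reject inference with a normal mixture, assuming complete (fully observed) characteristic variables and synthetic applicant data; the paper's extension to incomplete variables is not shown.

```python
# Mixture-model reject inference fits the mixture to ALL applicants, so the
# rejected cases contribute to the estimated component densities.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
accepted = rng.normal(1.5, 1.0, size=(800, 2))   # stand-in accepted applicants
rejected = rng.normal(-1.0, 1.2, size=(200, 2))  # stand-in rejected applicants

X = np.vstack([accepted, rejected])
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Posterior component probabilities for the rejected applicants can then be
# used to impute their unobserved good/bad status.
post = gmm.predict_proba(rejected)
print(post[:5])
```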

Application of Finite Mixture to Characterise Degraded Gmelina arborea Roxb Plantation in Omo Forest Reserve, Nigeria

  • Ogana, Friday Nwabueze
    • Journal of Forest and Environmental Science / v.34 no.6 / pp.451-456 / 2018
  • The use of a single-component distribution to describe the irregular stand structure of a degraded forest often leads to bias. Such bias can be overcome by applying a finite mixture distribution. Therefore, in this study, finite mixture distributions were used to characterise the irregular stand structure of the Gmelina arborea plantation in Omo forest reserve. Thirty plots, ten from each of the three stands established in 1984, 1990 and 2005, were used. The data were pooled per stand and fitted. Four finite mixture distributions were considered: the normal mixture, lognormal mixture, gamma mixture and Weibull mixture. The method of maximum likelihood was used to fit the finite mixture distributions to the data. Model assessment was based on the negative log-likelihood value ($-\log L$), Akaike information criterion (AIC), Bayesian information criterion (BIC) and root mean square error (RMSE). The results showed that the mixture distributions provide an accurate and precise characterisation of the irregular diameter distribution of the degraded Gmelina arborea stands. The $-\log L$, AIC, BIC and RMSE values ranged from -715.233 to -348.375, 703.926 to 1433.588, 718.598 to 1451.334 and 3.003 to 7.492, respectively. The performances of the four mixtures were broadly similar. This approach can be used to describe other irregular forest stand structures, especially multi-species forests.
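
A small sketch of fitting one of the candidate models, a two-component Weibull mixture, by maximum likelihood and reporting the assessment criteria named above; the diameter data are synthetic stand-ins for the Gmelina arborea plots, and the starting values are illustrative assumptions.

```python
# Fit a two-component Weibull mixture to diameter data by maximum likelihood
# and report -logL, AIC and BIC.
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(3)
dbh = np.concatenate([
    stats.weibull_min.rvs(2.5, scale=12, size=300, random_state=rng),
    stats.weibull_min.rvs(4.0, scale=30, size=200, random_state=rng),
])

def neg_loglik(params):
    logit_p, c1, s1, c2, s2 = params
    if min(c1, s1, c2, s2) <= 0:
        return np.inf
    p = 1 / (1 + np.exp(-logit_p))             # mixing proportion in (0, 1)
    pdf = (p * stats.weibull_min.pdf(dbh, c1, scale=s1)
           + (1 - p) * stats.weibull_min.pdf(dbh, c2, scale=s2))
    return -np.sum(np.log(pdf + 1e-300))

res = optimize.minimize(neg_loglik, x0=[0.0, 2.0, 10.0, 3.0, 25.0],
                        method="Nelder-Mead", options={"maxiter": 5000})

k, n, nll = 5, dbh.size, res.fun               # k = number of free parameters
print("-logL:", nll, "AIC:", 2 * k + 2 * nll, "BIC:", k * np.log(n) + 2 * nll)
```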

A Predictive Two-Group Multinormal Classification Rule Accounting for Model Uncertainty

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society / v.26 no.4 / pp.477-491 / 1997
  • A new predictive classification rule for assigning future cases into one of two multivariate normal populations (with an unknown normal mixture model) is considered. The development involves calculating the posterior probability of each possible normal-mixture model via a default Bayesian test criterion, the intrinsic Bayes factor, and suggests a predictive distribution for future cases to be classified that accounts for model uncertainty by weighting the effect of each model by its posterior probability. In this paper, our interest is focused on constructing a classification rule that takes care of uncertainty about the types of covariance matrices (homogeneity/heterogeneity) involved in the model. For the constructed rule, a Monte Carlo simulation study demonstrates routine application and notes benefits over the traditional predictive classification rule of Geisser (1982).
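
The intrinsic Bayes factor computation is not reproduced here; the sketch below only illustrates the model-averaging idea, weighting a homogeneous-covariance and a heterogeneous-covariance two-group normal classifier by approximate posterior model probabilities derived from BIC (a stand-in for the paper's criterion) on synthetic data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x1 = rng.multivariate_normal([0, 0], [[1, 0.3], [0.3, 1]], 100)
x2 = rng.multivariate_normal([2, 2], [[2, -0.5], [-0.5, 1.5]], 100)

def fit(homogeneous):
    mu1, mu2 = x1.mean(0), x2.mean(0)
    if homogeneous:
        pooled = np.cov(np.vstack([x1 - mu1, x2 - mu2]).T)
        s1 = s2 = pooled
        k = 2 * 2 + 3          # two mean vectors + one 2x2 covariance
    else:
        s1, s2 = np.cov(x1.T), np.cov(x2.T)
        k = 2 * 2 + 2 * 3      # two mean vectors + two covariances
    loglik = (stats.multivariate_normal.logpdf(x1, mu1, s1).sum()
              + stats.multivariate_normal.logpdf(x2, mu2, s2).sum())
    bic = k * np.log(len(x1) + len(x2)) - 2 * loglik
    return (mu1, mu2, s1, s2), bic

models = {h: fit(h) for h in (True, False)}
# Convert BIC differences into approximate posterior model weights
bics = np.array([models[True][1], models[False][1]])
w = np.exp(-0.5 * (bics - bics.min()))
w /= w.sum()

def averaged_prob_group1(x_new):
    probs = []
    for (mu1, mu2, s1, s2), _ in models.values():
        d1 = stats.multivariate_normal.pdf(x_new, mu1, s1)
        d2 = stats.multivariate_normal.pdf(x_new, mu2, s2)
        probs.append(d1 / (d1 + d2))           # equal prior group probabilities
    return float(np.dot(w, probs))

print(averaged_prob_group1([1.0, 1.0]))
```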


An approximate fitting for mixture of multivariate skew normal distribution via EM algorithm (EM 알고리즘에 의한 다변량 치우친 정규분포 혼합모형의 근사적 적합)

  • Kim, Seung-Gu
    • The Korean Journal of Applied Statistics / v.29 no.3 / pp.513-523 / 2016
  • Fitting a mixture of multivariate skew normal distributions (MSNMix) with multiple skewness parameter vectors via the EM algorithm often incurs a very high computational cost, because the moments and probabilities of the multivariate truncated normal distribution must be calculated in the E-step. Consequently, it is common to fit an asymmetric data set with an MSNMix with a simple skewness parameter vector, since this allows the E-step quantities to be computed in a univariate manner at low computational cost. However, the assumption of a simple skewness parameter vector is unrealistic in many situations. This paper proposes an approximate estimation method for the MSNMix with multiple skewness parameter vectors that still allows these quantities to be treated in a univariate manner. We also provide experiments to show its effectiveness.
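
As a much-simplified illustration (univariate rather than multivariate, and direct likelihood maximisation rather than the paper's approximate EM), the sketch below fits a two-component skew normal mixture with scipy's skewnorm on synthetic data.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(5)
x = np.concatenate([
    stats.skewnorm.rvs(4, loc=0, scale=1, size=400, random_state=rng),
    stats.skewnorm.rvs(-3, loc=5, scale=1.5, size=300, random_state=rng),
])

def neg_loglik(theta):
    logit_p, a1, m1, s1, a2, m2, s2 = theta
    if s1 <= 0 or s2 <= 0:
        return np.inf
    p = 1 / (1 + np.exp(-logit_p))             # mixing proportion in (0, 1)
    pdf = (p * stats.skewnorm.pdf(x, a1, m1, s1)
           + (1 - p) * stats.skewnorm.pdf(x, a2, m2, s2))
    return -np.sum(np.log(pdf + 1e-300))

res = optimize.minimize(neg_loglik, x0=[0.0, 1.0, 0.0, 1.0, -1.0, 5.0, 1.0],
                        method="Nelder-Mead",
                        options={"maxiter": 10000, "maxfev": 10000})
print("estimated parameters:", np.round(res.x, 3))
```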

Modeling Circular Data with Uniformly Dispersed Noise

  • Yu, Hye-Kyung; Jun, Kyoung-Ho; Na, Jong-Hwa
    • The Korean Journal of Applied Statistics / v.25 no.4 / pp.651-659 / 2012
  • In this paper we develop a statistical model for circular data with noise. In this setting, fitting a single circular model suffers from a lack-of-fit problem. To overcome this problem, we consider mixture models that include a circular uniform distribution and apply an EM algorithm to estimate the parameters. Both von Mises and wrapped skew normal distributions are considered in this paper. Simulation studies are carried out to assess the suggested EM algorithms. Finally, we apply the suggested method to fit the 2008 EHFRS (epidemic hemorrhagic fever with renal syndrome) data provided by the KCDC (Korea Centers for Disease Control and Prevention).
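
A minimal EM sketch for one of the mixtures considered, a von Mises component plus a circular uniform noise component, on synthetic angles; the kappa update uses a standard approximation and the wrapped skew normal case is not shown.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
theta = np.concatenate([
    stats.vonmises.rvs(kappa=4, loc=1.0, size=400, random_state=rng),
    rng.uniform(-np.pi, np.pi, 100),           # uniformly dispersed noise
])

p, mu, kappa = 0.5, 0.0, 1.0                   # initial values
for _ in range(200):
    # E-step: responsibility of the von Mises component for each angle
    f_vm = stats.vonmises.pdf(theta, kappa, loc=mu)
    f_un = 1.0 / (2 * np.pi)
    r = p * f_vm / (p * f_vm + (1 - p) * f_un)

    # M-step: weighted mean direction, resultant length, and kappa update
    p = r.mean()
    C, S = np.sum(r * np.cos(theta)), np.sum(r * np.sin(theta))
    mu = np.arctan2(S, C)
    Rbar = np.sqrt(C**2 + S**2) / r.sum()
    # Banerjee et al.'s approximation for the concentration parameter
    kappa = Rbar * (2 - Rbar**2) / (1 - Rbar**2)

print(f"p={p:.3f}, mu={mu:.3f}, kappa={kappa:.3f}")
```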

Effects on Regression Estimates under Misspecified Generalized Linear Mixed Models for Counts Data

  • Jeong, Kwang Mo
    • The Korean Journal of Applied Statistics / v.25 no.6 / pp.1037-1047 / 2012
  • The generalized linear mixed model (GLMM) is widely used for fitting categorical responses of clustered data. In the numerical approximation of the likelihood function, normality is assumed for the random-effects distribution; accordingly, commercial statistical packages also routinely fit the GLMM under this normality assumption. We may also encounter departures from the distributional assumption on the response variable. It would be interesting to investigate the impact on the parameter estimates under misspecification of these distributions; however, there has been limited research on this topic. We study the sensitivity, or robustness, of the maximum likelihood estimators (MLEs) of the GLMM for count data when the true underlying distribution is normal, gamma, exponential, or a mixture of two normal distributions. We also consider the effects on the MLEs when we fit a Poisson-normal GLMM while the outcomes are generated from a negative binomial distribution with overdispersion. Through a small-scale Monte Carlo study we check the empirical coverage probabilities of the parameters and the biases of the MLEs of the GLMM.
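
In the spirit of the sensitivity study described, the sketch below runs a tiny Monte Carlo in which overdispersed (negative binomial) counts are generated but a plain Poisson model is fitted, and the empirical coverage of the slope's nominal 95% interval is checked; a simple GLM, not a mixed model, is used to keep the sketch short, and all settings are illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
beta0, beta1, n, reps = 0.5, 0.3, 200, 500
covered = 0
for _ in range(reps):
    x = rng.normal(size=n)
    mu = np.exp(beta0 + beta1 * x)
    # Negative binomial counts via a gamma-Poisson mixture (overdispersion)
    lam = rng.gamma(shape=2.0, scale=mu / 2.0)
    y = rng.poisson(lam)
    # Fit a (misspecified) Poisson model
    fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Poisson()).fit()
    lo, hi = fit.conf_int()[1]                 # CI for the slope
    covered += (lo <= beta1 <= hi)
print("empirical coverage of nominal 95% CI:", covered / reps)
```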

A Bayesian Variable Selection Method for Binary Response Probit Regression

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society / v.28 no.2 / pp.167-182 / 1999
  • This article is concerned with the selection of subsets of predictor variables to be included in building a binary response probit regression model. Based on a Bayesian approach, we propose and develop a procedure that uses probabilistic considerations for selecting promising subsets. This procedure reformulates the probit regression setup as a hierarchical normal mixture model by introducing a set of hyperparameters that identify subset choices. The posterior probability of each subset of predictor variables is obtained through the Gibbs sampler, which samples indirectly from the multinomial posterior distribution on the set of possible subset choices. In this procedure, the most promising subset of predictors can thus be identified as the one with the highest posterior probability. To highlight the merit of the procedure, a couple of illustrative numerical examples are given.
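
The subset-selection prior itself is not reproduced here; the sketch below shows only the Albert-Chib data-augmentation Gibbs sampler for probit regression that this kind of hierarchical setup builds on, with a plain normal prior on the coefficients and synthetic data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n, p = 300, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([-0.5, 1.0, 0.0])
y = (X @ beta_true + rng.normal(size=n) > 0).astype(int)

tau2 = 10.0                                    # prior: beta ~ N(0, tau2 * I)
V = np.linalg.inv(X.T @ X + np.eye(p) / tau2)  # posterior covariance (fixed)
beta = np.zeros(p)
draws = []
for it in range(2000):
    # Sample latent z_i from truncated normals given beta and y
    mean = X @ beta
    lower = np.where(y == 1, -mean, -np.inf)
    upper = np.where(y == 1, np.inf, -mean)
    z = mean + stats.truncnorm.rvs(lower, upper, size=n, random_state=rng)
    # Sample beta from its multivariate normal full conditional
    beta = rng.multivariate_normal(V @ X.T @ z, V)
    if it >= 500:                              # discard burn-in
        draws.append(beta)
print("posterior means:", np.mean(draws, axis=0))
```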


Separating Signals and Noises Using EM Algorithm for Gaussian Mixture Model (가우시안 혼합 모델에 대한 EM 알고리즘을 이용한 신호와 잡음의 분리)

  • Yu, Si-Won; Yu, Han-Min; Lee, Hye-Seon; Jeon, Chi-Hyeok
    • Proceedings of the Korean Operations and Management Science Society Conference / 2007.11a / pp.469-473 / 2007
  • For the quantitative analysis of inclusions using OES data, the noise must be separated from the inclusions. Previous methods assume that the noise follows a normal distribution and classify intensity levels beyond a specific threshold as inclusions. However, such methods cannot identify inclusions in the low-intensity region, even though every inclusion is an element of some chemical compound. In this paper, we assume that the distribution of the OES data is a Gaussian mixture and estimate the parameters of the mixture model using the EM algorithm. We then calculate the mixing ratio of noise and inclusions from these parameters in order to separate them.
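
A minimal sketch of the separation step: fit a two-component Gaussian mixture to intensity readings (synthetic values stand in for the OES data) and use the estimated weights and posterior probabilities to split noise from inclusions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(9)
intensities = np.concatenate([rng.normal(100, 5, 5000),   # background noise
                              rng.normal(140, 15, 250)])  # inclusion peaks
X = intensities.reshape(-1, 1)

# EM fit of a two-component Gaussian mixture
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

noise_comp = int(np.argmax(gmm.weights_))      # dominant component = noise
prob_inclusion = 1.0 - gmm.predict_proba(X)[:, noise_comp]

print("estimated mixing ratio:", np.round(gmm.weights_, 3))
print("points classified as inclusions:", int(np.sum(prob_inclusion > 0.5)))
```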
