• 제목/요약/키워드: normal mixture distribution model

검색결과 28건 처리시간 0.02초

Estimating Suitable Probability Distribution Function for Multimodal Traffic Distribution Function

  • Yoo, Sang-Lok;Jeong, Jae-Yong;Yim, Jeong-Bin
    • 해양환경안전학회지
    • /
    • 제21권3호
    • /
    • pp.253-258
    • /
    • 2015
  • The purpose of this study is to find suitable probability distribution function of complex distribution data like multimodal. Normal distribution is broadly used to assume probability distribution function. However, complex distribution data like multimodal are very hard to be estimated by using normal distribution function only, and there might be errors when other distribution functions including normal distribution function are used. In this study, we experimented to find fit probability distribution function in multimodal area, by using AIS(Automatic Identification System) observation data gathered in Mokpo port for a year of 2013. By using chi-squared statistic, gaussian mixture model(GMM) is the fittest model rather than other distribution functions, such as extreme value, generalized extreme value, logistic, and normal distribution. GMM was found to the fit model regard to multimodal data of maritime traffic flow distribution. Probability density function for collision probability and traffic flow distribution will be calculated much precisely in the future.

정규분포기반 두각 혼합모형의 순환적 적합을 이용한 군집분석에서의 변수선택 (Variable Selection in Clustering by Recursive Fit of Normal Distribution-based Salient Mixture Model)

  • 김승구
    • 응용통계연구
    • /
    • 제26권5호
    • /
    • pp.821-834
    • /
    • 2013
  • Law 등 (2004)은 군집분석에서 변수선택을 위해 정규분포기반 "두각 혼합모형(salient mixture model)"의 사용을 제안하였다. 본 논문에서는 이 모형의 적합 상의 문제점과 변수선택의 결함을 지적하고 그 대안을 제시한다. 모의자료와 실자료를 바탕으로 제안된 방법이 기존의 방법보다 유용함을 보였다.

Reject Inference of Incomplete Data Using a Normal Mixture Model

  • Song, Ju-Won
    • 응용통계연구
    • /
    • 제24권2호
    • /
    • pp.425-433
    • /
    • 2011
  • Reject inference in credit scoring is a statistical approach to adjust for nonrandom sample bias due to rejected applicants. Function estimation approaches are based on the assumption that rejected applicants are not necessary to be included in the estimation, when the missing data mechanism is missing at random. On the other hand, the density estimation approach by using mixture models indicates that reject inference should include rejected applicants in the model. When mixture models are chosen for reject inference, it is often assumed that data follow a normal distribution. If data include missing values, an application of the normal mixture model to fully observed cases may cause another sample bias due to missing values. We extend reject inference by a multivariate normal mixture model to handle incomplete characteristic variables. A simulation study shows that inclusion of incomplete characteristic variables outperforms the function estimation approaches.

Application of Finite Mixture to Characterise Degraded Gmelina arborea Roxb Plantation in Omo Forest Reserve, Nigeria

  • Ogana, Friday Nwabueze
    • Journal of Forest and Environmental Science
    • /
    • 제34권6호
    • /
    • pp.451-456
    • /
    • 2018
  • The use of single component distribution to describe the irregular stand structure of degraded forest often lead to bias. Such biasness can be overcome by the application of finite mixture distribution. Therefore, in this study, finite mixture distribution was used to characterise the irregular stand structure of the Gmelina arborea plantation in Omo forest reserve. Thirty plots, ten each from the three stands established in 1984, 1990 and 2005 were used. The data were pooled per stand and fitted. Four finite mixture distributions including normal mixture, lognormal mixture, gamma mixture and Weibull mixture were considered. The method of maximum likelihood was used to fit the finite mixture distributions to the data. Model assessment was based on negative loglikelihood value ($-{\Lambda}{\Lambda}$), Akaike information criterion (AIC), Bayesian information criterion (BIC) and root mean square error (RMSE). The results showed that the mixture distributions provide accurate and precise characterisation of the irregular diameter distribution of the degraded Gmelina arborea stands. The $-{\Lambda}{\Lambda}$, AIC, BIC and RMSE values ranged from -715.233 to -348.375, 703.926 to 1433.588, 718.598 to 1451.334 and 3.003 to 7.492, respectively. Their performances were relatively the same. This approach can be used to describe other irregular forest stand structures, especially the multi-species forest.

A Predictive Two-Group Multinormal Classification Rule Accounting for Model Uncertainty

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society
    • /
    • 제26권4호
    • /
    • pp.477-491
    • /
    • 1997
  • A new predictive classification rule for assigning future cases into one of two multivariate normal population (with unknown normal mixture model) is considered. The development involves calculation of posterior probability of each possible normal-mixture model via a default Bayesian test criterion, called intrinsic Bayes factor, and suggests predictive distribution for future cases to be classified that accounts for model uncertainty by weighting the effect of each model by its posterior probabiliy. In this paper, our interest is focused on constructing the classification rule that takes care of uncertainty about the types of covariance matrices (homogeneity/heterogeneity) involved in the model. For the constructed rule, a Monte Carlo simulation study demonstrates routine application and notes benefits over traditional predictive calssification rule by Geisser (1982).

  • PDF

EM 알고리즘에 의한 다변량 치우친 정규분포 혼합모형의 근사적 적합 (An approximate fitting for mixture of multivariate skew normal distribution via EM algorithm)

  • 김승구
    • 응용통계연구
    • /
    • 제29권3호
    • /
    • pp.513-523
    • /
    • 2016
  • 다중 치우침 모수벡터를 가진 다변량 치우친 정규분포 (MSNMix)를 EM 알고리즘으로 적합하려면 E-step에서 다변량 절단 정규분포의 적률과 확률을 계산해야 하는데 이것은 매우 큰 계산 시간을 요구한다. 그래서 비대칭 자료를 적합하는데 흔히 단순 치우침 모수를 가진 모형을 적용한다. 이 모형은 단변량 처리방식으로 적합하는 것이 가능하기 때문에 처리속도가 매우 빠르다. 그러나 단순 치우침 모수를 적용하는 것은 응용에서 비현실적인 경우가 많다. 본 논문에서는 다중 치우침 모수를 가지는 MSNMix의 근사적 추정법을 제안하는데, 이 방법은 단변량 처리방식이 적용되므로 향상된 처리속도를 보장한다. 그리고 제안된 방법의 실효성을 보이기 위해 몇 가지 실험 결과를 제공한다.

Modeling Circular Data with Uniformly Dispersed Noise

  • Yu, Hye-Kyung;Jun, Kyoung-Ho;Na, Jong-Hwa
    • 응용통계연구
    • /
    • 제25권4호
    • /
    • pp.651-659
    • /
    • 2012
  • In this paper we developed a statistical model for circular data with noises. In this case, model fitting by single circular model has a lack-of-fit problem. To overcome this problem, we consider some mixture models that include circular uniform distribution and apply an EM algorithm to estimate the parameters. Both von Mises and Wrapped skew normal distributions are considered in this paper. Simulation studies are executed to assess the suggested EM algorithms. Finally, we applied the suggested method to fit 2008 EHFRS(Epidemic Hemorrhagic Fever with Renal Syndrome) data provided by the KCDC(Korea Centers for Disease Control and Prevention).

Effects on Regression Estimates under Misspecified Generalized Linear Mixed Models for Counts Data

  • Jeong, Kwang Mo
    • 응용통계연구
    • /
    • 제25권6호
    • /
    • pp.1037-1047
    • /
    • 2012
  • The generalized linear mixed model(GLMM) is widely used in fitting categorical responses of clustered data. In the numerical approximation of likelihood function the normality is assumed for the random effects distribution; subsequently, the commercial statistical packages also routinely fit GLMM under this normality assumption. We may also encounter departures from the distributional assumption on the response variable. It would be interesting to investigate the impact on the estimates of parameters under misspecification of distributions; however, there has been limited researche on these topics. We study the sensitivity or robustness of the maximum likelihood estimators(MLEs) of GLMM for counts data when the true underlying distribution is normal, gamma, exponential, and a mixture of two normal distributions. We also consider the effects on the MLEs when we fit Poisson-normal GLMM whereas the outcomes are generated from the negative binomial distribution with overdispersion. Through a small scale Monte Carlo study we check the empirical coverage probabilities of parameters and biases of MLEs of GLMM.

A Bayesian Variable Selection Method for Binary Response Probit Regression

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society
    • /
    • 제28권2호
    • /
    • pp.167-182
    • /
    • 1999
  • This article is concerned with the selection of subsets of predictor variables to be included in building the binary response probit regression model. It is based on a Bayesian approach, intended to propose and develop a procedure that uses probabilistic considerations for selecting promising subsets. This procedure reformulates the probit regression setup in a hierarchical normal mixture model by introducing a set of hyperparameters that will be used to identify subset choices. The appropriate posterior probability of each subset of predictor variables is obtained through the Gibbs sampler, which samples indirectly from the multinomial posterior distribution on the set of possible subset choices. Thus, in this procedure, the most promising subset of predictors can be identified as the one with highest posterior probability. To highlight the merit of this procedure a couple of illustrative numerical examples are given.

  • PDF

가우시안 혼합 모델에 대한 EM 알고리즘을 이용한 신호와 잡음의 분리 (Separating Signals and Noises Using EM Algorithm for Gaussian Mixture Model)

  • 유시원;유한민;이혜선;전치혁
    • 한국경영과학회:학술대회논문집
    • /
    • 한국경영과학회 2007년도 추계학술대회 및 정기총회
    • /
    • pp.469-473
    • /
    • 2007
  • For the quantitative analysis of inclusion using OES data, separating of noise and inclusion is needed. In previous methods assuming that noises come from a normal distribution, intensity levels beyond a specific threshold are determined as inclusions. However, it is not possible to classify inclusions in low intensity region using this method, even though every inclusion is an element of some chemical compound. In this paper, we assume that distribution of OES data is a Gaussian mixture and estimate the parameters of the mixture model using EM algorithm. Then, we calculate mixing ratio of noise and inclusion using these parameters to separate noise and inclusion.

  • PDF