http://dx.doi.org/10.29220/CSAM.2020.27.2.255

A fast approximate fitting for mixture of multivariate skew t-distribution via EM algorithm  

Kim, Seung-Gu (Department of Data and Information, Sangji University)
Publication Information
Communications for Statistical Applications and Methods / v.27, no.2, 2020, pp. 255-268
Abstract
Mixtures of multivariate canonical fundamental skew t-distributions (CFUST) have attracted interest in various fields, notably in the unsupervised learning community. However, fitting the model via the EM algorithm requires significant processing time, mainly because many multivariate t-cdfs (cumulative distribution functions) must be calculated in the E-step. In this article, we provide an approximate but fast method that calculates them in a univariate fashion, as the product of successively conditional univariate t-cdfs with a first-order Taylor approximation. By replacing all multivariate t-cdfs in the E-step with the proposed approximate versions, we obtain admissible results of fitting the model, with an 85% reduction in processing time for the 5-dimensional skewness case of the Australian Institute of Sport data set. Rough properties, advantages, and limits of this approach are also discussed.
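The idea of a product of successively conditional univariate t-cdfs can be written out as follows. This is only a sketch of one reading of the abstract, not the paper's own formulas: the notation ($X_{<j}$, $g_j$, $\xi_{<j}$) and the choice of the truncated mean as the expansion point are assumptions introduced here. The first display is the exact chain rule, the second and third use the standard conditional distribution of a multivariate t vector, and the last is the assumed first-order step.

```latex
% Exact chain-rule factorization for X ~ t_q(0, \Sigma, \nu)
\[
T_q(a;\Sigma,\nu)
  = P(X_1 \le a_1)\prod_{j=2}^{q}
    P\bigl(X_j \le a_j \mid X_{<j} \le a_{<j}\bigr),
\qquad X_{<j} = (X_1,\dots,X_{j-1})^\top .
\]
% Standard result: X_j given X_{<j} = x is univariate t with \nu + j - 1 d.f.
\[
g_j(x) := P\bigl(X_j \le a_j \mid X_{<j} = x\bigr)
  = T_1\!\left(\frac{a_j - \mu_j(x)}{s_j(x)};\ \nu + j - 1\right),
\]
\[
\mu_j(x) = \Sigma_{j,<j}\,\Sigma_{<j,<j}^{-1}\,x,
\qquad
s_j^2(x) = \frac{\nu + x^\top \Sigma_{<j,<j}^{-1} x}{\nu + j - 1}
  \bigl(\Sigma_{jj} - \Sigma_{j,<j}\,\Sigma_{<j,<j}^{-1}\,\Sigma_{<j,j}\bigr).
\]
% Assumed first-order step: expand g_j about \xi_{<j}; the linear term has
% zero conditional expectation when \xi_{<j} is the truncated mean.
\[
P\bigl(X_j \le a_j \mid X_{<j} \le a_{<j}\bigr)
  = E\bigl[g_j(X_{<j}) \mid X_{<j} \le a_{<j}\bigr]
  \approx g_j(\xi_{<j}),
\qquad
\xi_{<j} = E\bigl[X_{<j} \mid X_{<j} \le a_{<j}\bigr].
\]
```

Under this reading, each factor needs only a univariate t-cdf evaluation, which is where a saving over direct multivariate t-cdf computation in the E-step would come from. How the expansion point is obtained in practice is not stated in the abstract.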
Keywords
mixture model; CFUST; approximate t-cdf; EM algorithm