[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.5351/CSAM.2017.24.1.097

Model-based inverse regression for mixture data

Choi, Changhwan (Department of Statistics, Sungkyunkwan University)
Park, Chongsun (Department of Statistics, Sungkyunkwan University)

Publication Information

Communications for Statistical Applications and Methods / v.24, no.1, 2017, pp. 97-113

Abstract

This paper proposes a method for sufficient dimension reduction (SDR) of mixture data. We consider mixture data with more than one component, where each component has a distinct central subspace. We adapt the model-based sliced inverse regression (MSIR) approach to mixture data in a simple and intuitive manner, and employ mixture probabilistic principal component analysis (MPPCA) to estimate each central subspace and to cluster the data points. Results from simulation studies and a real data set show that the method recovers the appropriate central subspaces satisfactorily and is robust to the number of slices chosen. Root selection, estimation accuracy, and classification under the initial-value issues of MPPCA are also discussed, together with related simulation results.
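For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of classical sliced inverse regression (Li, 1991) on a single-component sample. It is not the MSIR or MPPCA procedure proposed in the paper. The function name `sir_directions`, its parameters, and the simulated data are illustrative assumptions only.

```python
import numpy as np

def sir_directions(X, y, n_slices=5, n_directions=1):
    """Classical SIR sketch (hypothetical helper, not the paper's method)."""
    n, p = X.shape
    # Standardize predictors: Z = (X - mean) @ Sigma^{-1/2}
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt
    # Partition observations into slices of roughly equal size by sorted response
    slices = np.array_split(np.argsort(y), n_slices)
    # Weighted covariance matrix of the slice means of Z
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # Leading eigenvectors of M (eigh returns ascending order, so reverse)
    _, v = np.linalg.eigh(M)
    top = v[:, ::-1][:, :n_directions]
    # Map back to the original predictor scale and normalize each column
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)

# Assumed toy data: one linear direction in five predictors
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2)
y = X @ beta + 0.1 * rng.standard_normal(1000)

B = sir_directions(X, y, n_slices=10)
cosine = abs(B[:, 0] @ beta)  # absolute cosine between estimate and truth
print(f"|cos| between estimated and true direction: {cosine:.3f}")
```

Changing `n_slices` in this toy setting gives a rough feel for the sensitivity to slice count that the abstract refers to. For mixture data the paper estimates a separate subspace for each component, which this single-component sketch does not attempt.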

Keywords

dimension reduction; sliced inverse regression; mixture modeling; principal component analysis; probability model
