http://dx.doi.org/10.29220/CSAM.2020.27.4.431

On hierarchical clustering in sufficient dimension reduction  

Yoo, Chaeyeon (Department of Statistics, Ewha Womans University)
Yoo, Younju (Department of Statistics, Ewha Womans University)
Um, Hye Yeon (Department of Statistics, Ewha Womans University)
Yoo, Jae Keun (Department of Statistics, Ewha Womans University)
Publication Information
Communications for Statistical Applications and Methods / v.27, no.4, 2020, pp. 431-443
Abstract
The K-means clustering algorithm has been applied successfully in sufficient dimension reduction. Unfortunately, the algorithm lacks reproducibility and nestness, which will be discussed in this paper. These are clear deficits of the K-means clustering algorithm; the hierarchical clustering algorithm, by contrast, has both reproducibility and nestness, yet an intensive comparison between the K-means and hierarchical clustering algorithms has not yet been done in a sufficient dimension reduction context. In this paper, we rigorously study the two clustering algorithms for two popular sufficient dimension reduction methodologies, the inverse mean and clustering mean methods, through intensive numerical studies. Simulation studies and two real data examples confirm that the hierarchical clustering algorithm has a potential advantage over the K-means algorithm.
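The two properties at issue can be illustrated directly. Below is a minimal sketch, assuming scikit-learn and SciPy; the data, cluster counts, and random seeds are illustrative and not taken from the paper. It shows that two K-means runs with different initializations can produce genuinely different partitions, that K-means solutions for k and k + 1 clusters need not nest, and that successive cuts of a single hierarchical-clustering tree are deterministic and nested by construction.

```python
# Minimal sketch (illustrative only): non-reproducibility and non-nestness of
# K-means versus the deterministic, nested cuts of hierarchical clustering.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))  # toy data without strong cluster structure

# Reproducibility: two K-means runs with different initializations may give
# different partitions (adjusted Rand index < 1 means they disagree).
km_a = KMeans(n_clusters=4, n_init=1, random_state=0).fit_predict(X)
km_b = KMeans(n_clusters=4, n_init=1, random_state=1).fit_predict(X)
print("K-means agreement (ARI):", adjusted_rand_score(km_a, km_b))

def is_nested(fine, coarse):
    # Each fine cluster must fall inside exactly one coarse cluster.
    return all(len(set(coarse[fine == c])) == 1 for c in set(fine))

# Nestness: K-means solutions for k = 4 and k = 5 need not nest ...
km4 = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
km5 = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
print("K-means nested:", is_nested(km5, km4))     # usually False

# ... while successive cuts of one hierarchical tree always nest.
tree = linkage(X, method="ward")
h4 = fcluster(tree, t=4, criterion="maxclust")
h5 = fcluster(tree, t=5, criterion="maxclust")
print("hierarchical nested:", is_nested(h5, h4))  # True by construction
```

Because the cluster labels play the role of slices in the sufficient dimension reduction methods studied here, any irreproducibility of the clustering step propagates directly to the estimated subspace.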
Keywords
central subspace; hierarchical clustering; informative predictor subspace; K-means clustering; multivariate slicing; sufficient dimension reduction;
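For concreteness, the sketch below shows how clustering enters the inverse mean (SIR-type) method named in the abstract: cluster labels of the possibly multivariate response replace the usual slices, and the within-slice predictor means span the estimated central subspace. It assumes only NumPy and SciPy; the function name cluster_sir and all settings are hypothetical and not the paper's implementation.

```python
# Hedged sketch of an SIR-type "inverse mean" estimator with the slicing step
# replaced by hierarchical clustering of the response (multivariate slicing).
# cluster_sir and every setting below are illustrative, not the paper's code.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_sir(X, Y, n_slices=5, n_dir=2):
    """Estimate n_dir central-subspace directions; slices come from a Ward
    hierarchical clustering of the (possibly multivariate) response Y."""
    p = X.shape[1]
    # Standardize predictors: Z = (X - mean(X)) Sigma^{-1/2}.
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt
    # Multivariate slicing: cut one hierarchical tree into n_slices groups.
    Ymat = np.atleast_2d(np.asarray(Y).T).T  # ensure an (n, q) response matrix
    labels = fcluster(linkage(Ymat, method="ward"), t=n_slices,
                      criterion="maxclust")
    # Weighted covariance of the within-slice means of Z (the inverse means).
    M = np.zeros((p, p))
    for h in np.unique(labels):
        mask = labels == h
        m = Z[mask].mean(axis=0)
        M += mask.mean() * np.outer(m, m)
    # Leading eigenvectors, mapped back to the original predictor scale.
    _, v = np.linalg.eigh(M)
    return inv_sqrt @ v[:, ::-1][:, :n_dir]
```

Replacing the linkage/fcluster step with a K-means fit of the response gives the K-means-based variant; the paper's comparison amounts to this choice of slicing step within the inverse mean and clustering mean methods.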