Browse > Article
http://dx.doi.org/10.4218/etrij.14.0113.0553

Data Clustering Method Using a Modified Gaussian Kernel Metric and Kernel PCA  

Lee, Hansung (SW.Content Research Laboratory, ETRI)
Yoo, Jang-Hee (SW.Content Research Laboratory, ETRI)
Park, Daihee (Department of Computer and Information Science, Korea University)
Publication Information
ETRI Journal / v.36, no.3, 2014 , pp. 333-342 More about this Journal
Abstract
Most hyper-ellipsoidal clustering (HEC) approaches use the Mahalanobis distance as a distance metric. It has been proven that HEC, under this condition, cannot be realized since the cost function of partitional clustering is a constant. We demonstrate that HEC with a modified Gaussian kernel metric can be interpreted as a problem of finding condensed ellipsoidal clusters (with respect to the volumes and densities of the clusters) and propose a practical HEC algorithm that is able to efficiently handle clusters that are ellipsoidal in shape and that are of different size and density. We then try to refine the HEC algorithm by utilizing ellipsoids defined on the kernel feature space to deal with more complex-shaped clusters. The proposed methods lead to a significant improvement in the clustering results over K-means algorithm, fuzzy C-means algorithm, GMM-EM algorithm, and HEC algorithm based on minimum-volume ellipsoids using Mahalanobis distance.
Keywords
Data clustering; hyper-ellipsoidal clustering; minimum-volume ellipsoids; kernel PCA; Gaussian kernel;
Citations & Related Records
Times Cited By KSCI : 1  (Citation Analysis)
연도 인용수 순위
1 D. Tax and P. Juszczak, "Kernel Whitening for One-Class Classification," Proc. Pattern Recogn. Support Vector Mach., Niagara Fall, Canada, vol. 2388, Aug. 10, 2002, pp. 40-52.
2 M. Grant and S. Boyd, "Graph Implementations for Nonsmooth Convex Programs," Recent Advances Learning Contr. LNCIS, vol. 371, 2008, pp. 95-110.
3 D. Cai, X. He, and J. Han, "Document Clustering Using Locality Preserving Indexing," IEEE Trans. KDE, vol. 17, no. 12, Dec. 2005, pp. 1624-1637.
4 The machine learning dataset of UCI are available. Accessed Jan. 2013. http://archive.ics.uci.edu/ml/
5 CVX Research, Inc. CVX: Matlab Software for Disciplined Convex Programming, version 2.0. Accessed Apr. 2013. http://cvxr.com/cvx
6 W. Song et al., "Comments on a Self-Organizing Network for Hyper-ellipsoidal Clustering (HEC)," IEEE Trans. Neural Netw., vol. 8, no. 6, Nov. 1997, pp. 1561-1563.   DOI   ScienceOn
7 H. Ichihashi, M. Ohue, and T. Miyoshi, "Fuzzy C-Means Clustering Algorithm with Pseudo Mahalanobis Distances," Proc. Third Asian Fuzzy Syst. Symp., Changwon, Rep. of Korea, June 18-21, 1998, pp. 148-152.
8 R. Krishnapuram and J. Kim, "A Clustering Algorithm Based on Minimum Volume," IEEE Int. Conf. Fuzzy Syst., vol. 2, New Orleans, LA, USA, Sept. 8-11, 1996, pp. 1387-1392.
9 H. Lee, J. Park, and D. Park, "Hyper-ellipsoidal Clustering Algorithm Using Linear Matrix Inequality," J. Korea Institute Intell. Syst., vol. 12, no. 4, Aug. 2002, pp. 300-305.   DOI
10 M. Kumar and J.B. Orlin, "Scale-Invariant Clustering with Minimum Volume Ellipsoids," Comput. Operations. Res., vol. 35, no. 4, Apr. 2008, pp. 1017-1029.   DOI   ScienceOn
11 M. Moshtaghi et al., "An Efficient Hyperellipsoidal Clustering Algorithm for Resource-Constrained Environments," Pattern Recogn., vol. 44, no. 9, Sept. 2011, pp. 2197-2209.   DOI   ScienceOn
12 R. Shioda and L. Tuncel, "Clustering via Minimum Volume Ellipsoids," Comput. Optim. Appl., vol. 37, no. 3, July 2007, pp. 247-295.   DOI
13 S. Boyd and L. Vandenberghe, Convex Optimization, 1st ed., Cambridge, UK: Cambridge University Press, 2004.
14 M.J. Todd and E.A. Yildirim, "On Khachiyan's Algorithm for the Computation of Minimum-Volume Enclosing Ellipsoids," Discr. Appl. Math., vol. 155, no. 13, Aug. 15, 2007, pp. 1731-1744.   DOI   ScienceOn
15 N. Moshtagh, Minimum Volume Enclosing Ellipsoids, Tech. report, the School of Engineering and Applied Science, Univ. Pennsylvania, PA, 2005.
16 P. Kumar and E.A. Yildirim, "Minimum Volume Enclosing Ellipsoids and Core Set," J. Optim. Theory Appl., vol. 126, no. 1, July 2005, pp. 1-21.   DOI
17 J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, Cambridge, UK: Cambridge University Press, 2004.
18 B. Scholkopf, A. Smola, and K.-R. Muller, "Kernel Principal Component Analysis," Proc. ICANN, LNCS, Lausanne, Switzerland, vol. 1327, Oct. 8-10, 1997, pp. 583-588.
19 A.K. Jain, "Data Clustering: 50 Years Beyond K-means," Pattern Recognition Lett., vol. 31, no. 8, June 2010, pp. 651-666.   DOI   ScienceOn
20 J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2nd ed., 2007.
21 J. Mao and A.K. Jain, "A Self-Organizing Network for Hyperellipsoidal Clustering (HEC)," IEEE Trans. Neural Netw., vol. 7, no. 1, Jan. 1996, pp. 16-29.   DOI   ScienceOn
22 W. Song et al., "The Hyperellipsoidal Clustering Using Genetic Algorithm," IEEE Int. Conf. Intell. Process. Syst., Beijing, China, Oct. 28-31, 1997, pp. 592-596.
23 J. Cao et al., "A Max-Flow-Based Similarity Measure for Spectral Clustering," ETRI J., vol. 35, no. 2, Apr. 2013, pp. 311-320.   DOI   ScienceOn