Empirical Comparisons of Clustering Algorithms using Silhouette Information

Jun, Sung-Hae; Lee, Seung-Joo

doi:10.5391/IJFIS.2010.10.1.031

International Journal of Fuzzy Logic and Intelligent Systems

Volume 10 Issue 1 / Pages 31-36 / 2010 / 1598-2645 (pISSN) / 2093-744X (eISSN)

Korean Institute of Intelligent Systems (한국지능시스템학회)

Jun, Sung-Hae (Department of Bioinformatics & Statistics, Cheongju University) ;
Lee, Seung-Joo (Department of Bioinformatics & Statistics, Cheongju University)

Received : 2009.08.30
Accepted : 2010.01.10
Published : 2010.03.25

Abstract

Clustering algorithms are used in many fields. When a data set must be grouped into clusters, a range of algorithms based on similarity or distance measures can be considered, and most clustering work relies on hierarchical or non-hierarchical algorithms. In practice, researchers choose among these algorithms case by case and must select a suitable method subjectively, based on their prior knowledge. To address this subjectivity, this paper empirically compares popular hierarchical and non-hierarchical clustering algorithms using the silhouette measure. We use silhouette information to evaluate clustering results, such as the number of clusters and the cluster variance. We verify the comparison with experiments on data sets from the UCI Machine Learning Repository. The results allow clustering algorithms to be chosen efficiently and objectively.
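As an illustration of the silhouette measure the abstract refers to, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation. It assumes Euclidean distance and the usual silhouette definition, s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from point i to the other members of its cluster and b(i) is the smallest mean distance from point i to the members of any other cluster. The function names, the toy points, and the two labelings are hypothetical and chosen only for the example.

```python
from math import dist
from statistics import mean


def silhouette_values(points, labels):
    """Return the silhouette value s(i) of every point (Euclidean distance assumed)."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Convention: a point alone in its cluster gets s(i) = 0.
            values.append(0.0)
            continue
        # a(i): mean distance to the other members of the same cluster.
        a = mean(dist(points[i], points[j]) for j in own)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def average_silhouette(points, labels):
    """Mean silhouette value over all points; higher suggests a better partition."""
    return mean(silhouette_values(points, labels))


# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
good = [0, 0, 0, 1, 1, 1]  # labeling that matches the two groups
poor = [0, 1, 0, 1, 0, 1]  # labeling that mixes the two groups

print("good labeling:", average_silhouette(points, good))
print("poor labeling:", average_silhouette(points, poor))
```

The same average can be computed for labelings produced by different algorithms, or by one algorithm with different numbers of clusters, and the labeling with the higher value preferred. That is the kind of comparison the abstract describes, though the paper's exact procedure may differ.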
