Nearest neighbor and validity-based clustering

  • Son, Seo H. (Top Engineering Co., Ltd.) ;
  • Seo, Suk T. (Department of Electrical Engineering, Yeungnam University) ;
  • Kwon, Soon H. (Department of Electrical Engineering, Yeungnam University)
  • Published : 2004.12.01

Abstract

The clustering problem can be formulated as the problem of finding the number of clusters and a partition matrix for a given data set, using either iterative or non-iterative algorithms. The authors propose a nearest neighbor and validity-based clustering algorithm in which each data point is linked with its nearest neighbor to form initial clusters, and each initial cluster is then linked with its nearest neighbor cluster to form a new cluster. Linking between clusters continues until no further linking is possible. The optimal set of clusters is then identified using a conventional cluster validity index. Experimental results on well-known data sets are provided to show the effectiveness of the proposed clustering algorithm.
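The linking procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance measure, how distance between clusters is computed, or which validity index is used. The sketch assumes Euclidean distance, cluster-to-cluster distance measured between centroids, and a crisp Xie-Beni-style index (compactness divided by separation, lower is better). All function names (`link_nearest`, `nn_levels`, `validity`, `cluster`) are hypothetical.

```python
import numpy as np


def link_nearest(points):
    """Link every item to its nearest neighbor and return component labels."""
    n = len(points)
    # Pairwise Euclidean distances (assumed metric); ignore self-distances.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nearest = d.argmin(axis=1)

    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in enumerate(nearest):
        parent[find(i)] = find(int(j))

    roots = [find(i) for i in range(n)]
    _, labels = np.unique(roots, return_inverse=True)
    return labels


def centroids_of(X, labels):
    return np.array([X[labels == c].mean(axis=0) for c in range(labels.max() + 1)])


def nn_levels(X):
    """Return the labelings produced by repeated nearest-neighbor linking."""
    labels = link_nearest(X)          # initial clusters from point-to-point links
    levels = [labels]
    while labels.max() > 0:           # stop when only one cluster is left
        merged = link_nearest(centroids_of(X, labels))  # cluster-to-cluster links
        labels = merged[labels]
        levels.append(labels)
    return levels


def validity(X, labels):
    """Crisp Xie-Beni-style index (an assumed choice): lower is better."""
    c = centroids_of(X, labels)
    compactness = ((X - c[labels]) ** 2).sum()
    k = len(c)
    separation = min(((c[a] - c[b]) ** 2).sum()
                     for a in range(k) for b in range(a + 1, k))
    return compactness / (len(X) * separation)


def cluster(X):
    """Pick the level with the best validity among levels with >= 2 clusters."""
    X = np.asarray(X, dtype=float)
    candidates = [lab for lab in nn_levels(X) if lab.max() > 0]
    if not candidates:                # everything collapsed into a single cluster
        return np.zeros(len(X), dtype=int)
    return min(candidates, key=lambda lab: validity(X, lab))
```

Calling `cluster(X)` on an array of shape `(n_samples, n_features)` returns one integer label per point. Because each linking pass at least halves the number of clusters, only a handful of candidate partitions need to be scored by the validity index.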


Cited by

  1. Density and Frequency-Aware Cluster Identification for Spatio-Temporal Sequence Data vol.93, pp.1, 2017, https://doi.org/10.1007/s11277-016-3937-x