Improvement of K-means Clustering Through Particle Swarm Optimization

입자 군집 최적화 알고리즘을 통한 K-평균 군집화 개선

  • Kyeong Chae Yang (Graduate School of Consulting, Kumoh National Institute of Technology)
  • Minje Kim (Department of Consulting Graduate School, Kumoh National Institute of Technology)
  • Jonghwan Lee (Department of Consulting Graduate School, Kumoh National Institute of Technology)
  • 양경채 (금오공과대학교 컨설팅대학원)
  • 김민제 (금오공과대학교 산업공학부)
  • 이종환 (금오공과대학교 산업공학부)
  • Received : 2024.07.15
  • Accepted : 2024.09.12
  • Published : 2024.09.30

Abstract

Unsupervised learning is a type of machine learning in which, unlike supervised or reinforcement learning, no target value is given for the input. Clustering is the main technique used for unsupervised learning, and K-means clustering is one of its representative methods. Because K-means fixes the number of clusters in advance and then repeatedly moves each cluster center to the central point of the data assigned to that cluster, the resulting clusters may not be optimal. In this study, a particle swarm optimization algorithm, which determines the movement vector from several variables in addition to the center point, is applied to K-means clustering. The improved K-means clustering can keep moving toward better solutions even when the cluster centers no longer change. In the conventional method, each cluster center moves to the center of the data belonging to the cluster and clustering ends when the clusters stop changing, so characteristics other than the center value are ignored. In contrast, the improved method uses a central value, an average value, and a random value as variables and applies a particle swarm optimization algorithm that updates the movement vector at every iteration. As a result, the improved clustering method achieved a better silhouette score, the cluster fitness index used in this study, than the conventional method.
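The abstract does not state the update equations, so the following is only a minimal sketch of how a particle-swarm-style velocity update could be combined with K-means, not the authors' implementation. It assumes that each cluster center behaves like a particle whose velocity combines inertia, a randomly weighted pull toward the mean of its assigned points, and a randomly weighted pull toward the best centers found so far, with the silhouette score as the fitness. The names `pso_kmeans`, `assign`, `silhouette`, `w`, `c1`, and `c2` are hypothetical and chosen for illustration.

```python
import numpy as np


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)


def silhouette(points, labels):
    """Mean silhouette score (Rousseeuw, 1987), computed directly with NumPy."""
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return -1.0  # assumption: treat a single-cluster result as worst case
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    scores = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # singleton cluster: score 0 by convention
        a = dist[i, same].sum() / (same.sum() - 1)  # mean intra-cluster distance
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        if denom > 0:
            scores[i] = (b - a) / denom
    return scores.mean()


def pso_kmeans(points, k, iters=50, w=0.5, c1=1.0, c2=1.0, seed=0):
    """PSO-style K-means sketch; w, c1, c2 are assumed, untuned coefficients."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    velocity = np.zeros_like(centers)
    labels = assign(points, centers)
    best_centers, best_score = centers.copy(), silhouette(points, labels)
    for _ in range(iters):
        # Mean of each cluster (the plain K-means target); an empty cluster
        # keeps its current center.
        means = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        r1, r2 = rng.random((k, 1)), rng.random((k, 1))  # random weights
        velocity = (w * velocity
                    + c1 * r1 * (means - centers)
                    + c2 * r2 * (best_centers - centers))
        centers = centers + velocity
        labels = assign(points, centers)
        score = silhouette(points, labels)
        if score > best_score:  # remember the best-scoring centers
            best_centers, best_score = centers.copy(), score
    return best_centers, assign(points, best_centers), best_score
```

Calling `pso_kmeans(data, k=3)` on an `(n, d)` array returns the best centers found, their labels, and the corresponding silhouette score. Because the velocity keeps the centers moving after the cluster means stop changing, the search can continue past the point where plain K-means would terminate, which is the behavior the abstract describes; whether it actually improves the score depends on the data and on the assumed coefficients.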

Acknowledgement

This paper was supported by Kumoh National Institute of Technology (2022~2024).
