Performance Improvement of Fuzzy C-Means Clustering Algorithm by Optimized Early Stopping for Inhomogeneous Datasets

  • Chae-Rim Han (Department of Convergence Security Engineering, Sungshin Women's University)
  • Sun-Jin Lee (Department of Future Convergence Technology Engineering, Sungshin Women's University)
  • Il-Gu Lee (Department of Convergence Security Engineering, Sungshin Women's University)
  • Received: 2023.01.05
  • Reviewed: 2023.08.10
  • Published: 2023.09.30

Abstract

Responding to changes in artificial intelligence models and the data environment is crucial for increasing the data-learning accuracy and inference stability of industrial applications. A learning model that is overfitted to specific training data shows poor learning performance and reduced flexibility. An early stopping technique is therefore used to stop learning at an appropriate time. However, this technique does not consider the homogeneity and independence of the data collected by heterogeneous nodes in a differential network environment, which results in low learning accuracy and degraded system performance. This study maximizes the generalization performance of neural networks while minimizing the effect of dataset homogeneity, achieving an accuracy of 99.7%. Compared with the conventional method, this corresponds to a decrease in delay time by a factor of 2.33 and an improvement in performance by a factor of 2.5.
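
To make the general idea of stopping an iterative clustering run early more concrete, the following is a minimal sketch of fuzzy c-means on one-dimensional data with a patience-based stopping rule. It is not the optimized criterion proposed in the paper, which the abstract does not specify. The function name `fcm_early_stop`, the parameters `tol` and `patience`, and the evenly spaced initial centers are all assumptions made for illustration.

```python
def fcm_early_stop(xs, c=2, m=2.0, tol=1e-4, patience=3, max_iter=100):
    """Fuzzy c-means on 1-D data with a patience-based stop (illustrative only).

    Hypothetical stopping rule: halt once the objective has failed to improve
    on its best value by more than `tol` for `patience` consecutive iterations.
    """
    lo, hi = min(xs), max(xs)
    # Assumed deterministic initialization: centers spread over the data range.
    centers = [lo + (hi - lo) * (k + 0.5) / c for k in range(c)]
    n = len(xs)
    best, stale = float("inf"), 0
    for it in range(1, max_iter + 1):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        u = []
        for x in xs:
            d = [max(abs(x - v), 1e-12) for v in centers]  # avoid division by zero
            u.append([
                1.0 / sum((d[k] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                for k in range(c)
            ])
        # Center update: mean of the data weighted by u_ik^m.
        centers = [
            sum(u[i][k] ** m * xs[i] for i in range(n))
            / sum(u[i][k] ** m for i in range(n))
            for k in range(c)
        ]
        # Fuzzy c-means objective: membership-weighted squared distances.
        obj = sum(
            u[i][k] ** m * (xs[i] - centers[k]) ** 2
            for i in range(n) for k in range(c)
        )
        if best - obj > tol:
            best, stale = obj, 0   # meaningful improvement: reset the counter
        else:
            stale += 1             # plateau: count toward the patience limit
            if stale >= patience:
                break
    return centers, u, it
```

Calling `fcm_early_stop([1.0, 1.1, 0.9, 5.0, 5.2, 4.8])` returns the two cluster centers, the membership matrix, and the number of iterations actually run. A larger `patience` trades extra iterations for a lower risk of stopping on a temporary plateau.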

Acknowledgement

This work was partially supported by the Korean Government (MOTIE) (P0008703, the Competency Development Program for Industry Specialist), and the MSIT under the ICAN (ICT Challenge and Advanced Network of HRD) program (No. IITP-2022-RS-2022-00156310) supervised by the Institute of Information & Communication Technology Planning & Evaluation (IITP).
