K-Means Clustering in the PCA Subspace using an Unified Measure

  • Yoo, Jae-Hung (류재흥; Dept. of Computer Engineering, Chonnam National University)
  • Received : 2022.06.30
  • Accepted : 2022.08.17
  • Published : 2022.08.31

Abstract

K-means clustering is a representative clustering technique. However, its performance evaluation measure and the method for determining the minimum number of clusters have not been integrated. This paper introduces a method for determining the minimum number of clusters numerically and presents explained variance as a unified measure. We propose that, to satisfy both the minimum number of clusters and the explained-variance threshold at the same time, k-means clustering should be performed in the PCA subspace. The aim is to explain in principle why principal component analysis and k-means clustering are performed sequentially in pattern recognition and machine learning.
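
The procedure outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so the following are assumptions: the explained variance of a clustering is taken as 1 minus the within-cluster sum of squares divided by the total sum of squares, the same threshold is applied to both the PCA step and the clustering step, and k-means uses a simple farthest-point initialization. The function names (`pca_subspace`, `kmeans`, `clustering_explained_variance`, `min_clusters`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def pca_subspace(X, threshold):
    """Project X onto the fewest principal components whose cumulative
    explained-variance ratio reaches `threshold` (assumed criterion)."""
    Xc = X - X.mean(axis=0)
    # Squared singular values of the centered data are proportional to
    # the variance captured by each principal component.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    m = min(int(np.searchsorted(ratio, threshold)) + 1, len(s))
    return Xc @ Vt[:m].T, m


def kmeans(Z, k, n_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization
    (a simplification chosen for reproducibility). Returns the centers."""
    centers = [Z[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new = np.array([Z[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers


def clustering_explained_variance(Z, centers):
    """Assumed measure: 1 - (within-cluster SS) / (total SS)."""
    dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    wcss = dist.min(axis=1).sum()
    tss = ((Z - Z.mean(axis=0)) ** 2).sum()
    return 1.0 - wcss / tss


def min_clusters(Z, threshold, k_max=20):
    """Smallest k whose clustering explained variance reaches `threshold`."""
    for k in range(1, k_max + 1):
        ev = clustering_explained_variance(Z, kmeans(Z, k))
        if ev >= threshold:
            return k, ev
    raise ValueError("threshold not reached within k_max clusters")


# Synthetic example: three well-separated groups in five dimensions.
rng = np.random.default_rng(0)
means = np.array([[0, 0, 0, 0, 0], [10, 0, 0, 0, 0], [0, 10, 0, 0, 0]],
                 dtype=float)
X = np.vstack([m + 0.5 * rng.standard_normal((50, 5)) for m in means])

Z, n_components = pca_subspace(X, threshold=0.9)
k, ev = min_clusters(Z, threshold=0.9)
print(n_components, k, round(ev, 3))
```

Using one threshold for both steps mirrors the idea of a single unified measure: PCA keeps only as many dimensions as the threshold requires, and the search over k stops at the first clustering that meets the same threshold inside that subspace.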
