Spectral clustering: summary and recent research issues

  • Jeong, Sanghun (Department of Statistics, Pusan National University)
  • Bae, Suhyeon (Department of Statistics, Pusan National University)
  • Kim, Choongrak (Department of Statistics, Pusan National University)
  • Received : 2020.02.19
  • Accepted : 2020.03.09
  • Published : 2020.04.30

Abstract

K-means clustering groups data points using a spherical or elliptical metric, so it does not work well for non-convex data such as concentric circles. Spectral clustering, which is based on graph theory, is a general and robust technique for handling non-standard data types, including non-convex data. Results obtained by spectral clustering often outperform those of traditional methods such as K-means. In this paper, we review spectral clustering and discuss important issues, including determining the number of clusters K, estimating the scale parameter in the adjacency between two points, and dimension reduction for clustering high-dimensional data.

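K-means clustering is very widely used, but because its similarity is defined in spherical or ellipsoidal terms, it gives good results only when each cluster forms a convex set and performs very poorly otherwise. Spectral clustering not only compensates for this weakness of K-means clustering but also performs well on many types of data, including high-dimensional data, and has therefore recently been widely used in artificial neural network models. However, it still has many shortcomings that need to be addressed. In this paper, we give an accessible introduction to spectral clustering and review recent research trends, including estimation of the number of clusters, estimation of the scale parameter, and dimension reduction for high-dimensional data.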
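The contrast described in the abstract can be illustrated with a small sketch on two concentric circles. This is not the procedure from the paper: it is a simplified two-way split that takes the sign of the second eigenvector of the random walk matrix D^{-1}W, found by power iteration, and compares it with plain Lloyd iterations for K = 2. The function names (`ring`, `two_means`, `spectral_bipartition`), the radii, the point counts, and the scale parameter `sigma = 0.5` are all assumptions chosen for illustration.

```python
import math

def ring(n, radius):
    """Return n evenly spaced points on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]

def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def two_means(points, iters=50):
    """Plain Lloyd iterations for K = 2, started from the first and last point."""
    centers = [points[0], points[-1]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
                  for p in points]
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster becomes empty
                centers[k] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return labels

def spectral_bipartition(points, sigma=0.5, iters=1500):
    """Two-way spectral split (illustrative): sign of the second eigenvector
    of the random walk matrix P = D^{-1} W, computed by power iteration."""
    n = len(points)
    # Gaussian affinity with zero diagonal; sigma is the scale parameter (assumed).
    W = [[0.0 if i == j
          else math.exp(-sq_dist(points[i], points[j]) / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in W]
    total = sum(deg)
    # Deterministic, non-constant starting vector.
    v = [math.sin(1.0 + i) for i in range(n)]
    for _ in range(iters):
        # One multiplication by P = D^{-1} W.
        v = [sum(W[i][j] * v[j] for j in range(n)) / deg[i] for i in range(n)]
        # Project out the constant vector (trivial eigenvector, eigenvalue 1)
        # using the degree-weighted inner product.
        shift = sum(d * x for d, x in zip(deg, v)) / total
        v = [x - shift for x in v]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return [0 if x >= 0 else 1 for x in v]

def same_partition(a, b):
    """True if two binary labelings agree up to swapping the label names."""
    return a == b or a == [1 - x for x in b]

# Inner circle of radius 1 (20 points) and outer circle of radius 3 (40 points).
points = ring(20, 1.0) + ring(40, 3.0)
truth = [0] * 20 + [1] * 40

spectral_labels = spectral_bipartition(points)
kmeans_labels = two_means(points)
print("spectral recovers the circles:", same_partition(spectral_labels, truth))
print("2-means recovers the circles:", same_partition(kmeans_labels, truth))
```

With two centers, K-means can only split the plane along a straight line, so it cannot isolate the inner circle from the outer one. The graph-based split depends on connectivity along each ring instead, and its result is sensitive to the choice of `sigma`, which is one of the issues the abstract lists.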
