
$F_n$-Measure: An External Cluster Evaluation Measure

  • Kim, Kyeongtaek (Department of Industrial and Management Engineering, Hannam University)
  • Received: 2012.11.15
  • Reviewed: 2012.12.13
  • Published: 2012.12.31

Abstract

F-Measure is one of the external measures used to evaluate the validity of clustering results. Although it has clear advantages over other widely used external measures such as Purity and Entropy, F-Measure is inherently less sensitive than other validity measures. This insensitivity stems from its definition, which counts only the most influential portions. In this research, we present $F_n$-Measure, an external cluster evaluation measure based on F-Measure. $F_n$-Measure is sensitive enough to detect differences between clustering results in cases where F-Measure cannot. We compare $F_n$-Measure with F-Measure on a few clustering results and show, in terms of homogeneity and completeness, which measure gives the better result.
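The insensitivity described in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly used class-weighted form of the clustering F-Measure (for each true class, take the best F-score reached by any single cluster, then average by class size); the abstract does not spell out the exact formula, so this form is an assumption. The function name `f_measure` and the toy labels are hypothetical, and the sketch does not implement $F_n$-Measure, whose definition is not given here.

```python
from collections import Counter


def f_measure(classes, clusters):
    """Class-weighted clustering F-Measure (assumed standard form).

    For every true class, only the single best-matching cluster contributes;
    all other clusters holding members of that class are ignored.
    """
    n = len(classes)
    class_sizes = Counter(classes)
    cluster_sizes = Counter(clusters)
    overlap = Counter(zip(classes, clusters))  # items shared by (class, cluster)

    total = 0.0
    for c, n_c in class_sizes.items():
        best = 0.0
        for k, n_k in cluster_sizes.items():
            n_ck = overlap[(c, k)]
            if n_ck == 0:
                continue
            precision = n_ck / n_k
            recall = n_ck / n_c
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_c / n) * best  # weight by class size
    return total


# Toy data (hypothetical): 6 items of class A, 6 items of class B.
classes = ["A"] * 6 + ["B"] * 6

# Clustering X: A split into 4 + 2, B kept whole.
clusters_x = [0, 0, 0, 0, 1, 1] + [2] * 6
# Clustering Y: A split into 4 + 1 + 1 (more fragmented), B kept whole.
clusters_y = [0, 0, 0, 0, 1, 2] + [3] * 6

f_x = f_measure(classes, clusters_x)
f_y = f_measure(classes, clusters_y)
print(f_x, f_y)
```

Under this assumed definition both clusterings score about 0.9, even though Y fragments class A further than X does. Only the largest matching cluster for each class enters the sum, so changes in the remaining portions go unnoticed, which is the kind of difference a more sensitive measure is meant to detect.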

Project Information

Host institution of the research project: Hannam University
