Visualizing Cluster Hierarchy Using Hierarchy Generation Framework

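  • 신동화 (Department of Computer Science and Engineering, Seoul National University) ;
  • 이세희 (Department of Computer Science and Engineering, Seoul National University) ;
  • 서진욱 (Department of Computer Science and Engineering, Seoul National University)
  • Received : 2015.03.25
  • Accepted : 2015.04.16
  • Published : 2015.06.15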

Abstract

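Clustering algorithms differ, depending on their type, in the kinds of clusters they can produce and in the level of information they can reveal. Density-based clustering algorithms are good at capturing arbitrarily shaped clusters in the data distribution, but they reveal very little hierarchy information, or none at all. Hierarchical clustering algorithms, in contrast, provide detailed hierarchy information but do not capture clusters well unless they are spherical. This paper presents a hierarchy generation framework that combines only the strengths of OPTICS and agglomerative hierarchical clustering, the representative algorithms of these two approaches. It also provides a visual analytics application that supports several visualization and interaction techniques for effective data analysis.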

There are many types of clustering algorithms, such as centroid-based, hierarchical, and density-based methods. Each algorithm groups data according to its own principles and therefore produces different kinds of clusters. Ordering Points To Identify the Clustering Structure (OPTICS) is a well-known density-based algorithm for finding arbitrarily shaped clusters of varying density, but the clusters it yields are only loosely related to one another. Hierarchical agglomerative clustering (HAC) reveals a hierarchical structure of clusters, but cannot clearly identify non-convex clusters. In this paper, we present a novel hierarchy generation framework, together with an application, that aids users by combining the advantages of the two clustering methods.
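To make the notion of a cluster hierarchy concrete, the following minimal sketch illustrates only the HAC side of the comparison. It is not the framework proposed in the paper. It assumes complete linkage and Euclidean distance, and the names `complete_linkage`, `agglomerate`, and the sample `points` are hypothetical, chosen purely for illustration. The returned merge history is the information a dendrogram would display.

```python
from itertools import combinations
from math import dist


def complete_linkage(a, b, points):
    """Distance between two clusters: the largest pairwise point distance."""
    return max(dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points):
    """Naive HAC (assumed complete linkage); returns the merge history.

    Each entry is (cluster_a, cluster_b, distance), where clusters are
    tuples of point indices. The sequence of merges forms the hierarchy.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters under complete linkage.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: complete_linkage(pair[0], pair[1], points),
        )
        d = complete_linkage(a, b, points)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


# Hypothetical sample data: two tight pairs and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0), (20.0, 0.0)]
merges = agglomerate(points)
for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d:.3f}")
```

The merge distances grow from the bottom of the hierarchy to the top, which is what lets a user cut the tree at different levels. A linkage of this kind tends to favor compact, roughly spherical groups, which is consistent with the limitation on non-convex clusters noted in the abstract.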

Project Information

Supervising research organization: National Research Foundation of Korea (NRF)
