• Title/Summary/Keyword: Hierarchical clustering analysis

Search Result 250, Processing Time 0.021 seconds

Development of Clustering Algorithm and Tool for DNA Microarray Data (DNA 마이크로어레이 데이타의 클러스터링 알고리즘 및 도구 개발)

  • 여상수;김성권
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.30 no.10
    • /
    • pp.544-555
    • /
    • 2003
  • Since the result data from DNA microarray experiments contain a lot of gene expression information, adequate analysis methods are required. Hierarchical clustering is widely used for analysis of gene expression profiles. In this paper, we study leaf-ordering, which is a post-processing for the dendrograms output by hierarchical clusterings to improve the efficiency of DNA microarray data analysis. At first, we analyze existing leaf-ordering algorithms and then present new approaches for leaf-ordering. And we introduce a software HCLO(Hierarchical Clustering & Leaf-Ordering Tool) that is our implementation of hierarchical clustering, some of existing leaf-ordering algorithms and those presented in this paper.

An Agglomerative Hierarchical Variable-Clustering Method Based on a Correlation Matrix

  • Lee, Kwangjin
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.2
    • /
    • pp.387-397
    • /
    • 2003
  • Generally, most of researches that need a variable-clustering process use an exploratory factor analysis technique or a divisive hierarchical variable-clustering method based on a correlation matrix. And some researchers apply a object-clustering method to a distance matrix transformed from a correlation matrix, though this approach is known to be improper. On this paper an agglomerative hierarchical variable-clustering method based on a correlation matrix itself is suggested. It is derived from a geometric concept by using variate-spaces and a characterizing variate.

Agglomerative Hierarchical Clustering Analysis with Deep Convolutional Autoencoders (합성곱 오토인코더 기반의 응집형 계층적 군집 분석)

  • Park, Nojin;Ko, Hanseok
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.1
    • /
    • pp.1-7
    • /
    • 2020
  • Clustering methods essentially take a two-step approach; extracting feature vectors for dimensionality reduction and then employing clustering algorithm on the extracted feature vectors. However, for clustering images, the traditional clustering methods such as stacked auto-encoder based k-means are not effective since they tend to ignore the local information. In this paper, we propose a method first to effectively reduce data dimensionality using convolutional auto-encoder to capture and reflect the local information and then to accurately cluster similar data samples by using a hierarchical clustering approach. The experimental results confirm that the clustering results are improved by using the proposed model in terms of clustering accuracy and normalized mutual information.

Microarray data analysis using relative hierarchical clustering (상대적 계층적 군집 방법을 이용한 마이크로어레이 자료의 군집분석)

  • Woo, Sook Young;Lee, Jae Won;Jhun, Myoungshic
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.5
    • /
    • pp.999-1009
    • /
    • 2014
  • Hierarchical clustering analysis helps easily exploring massive microarray data and understanding biological phenomena with dendrogram. But, because hierarchical clustering algorithms only consider the absolute similarity, it is difficult to illustrate a relative dissimilarity, which consider not only the distance between a pair of clusters, but also how distant are they from the rest of the clusters. In this study, we introduced the relative hierarchical clustering method proposed by Mollineda and Vidal (2000) and compared hierarchical clustering method and relative hierarchical method using the simulated data and the real data in the various situations. The evaluation of the quality of two hierarchical methods was performed using percentage of incorrectly grouped points (PIGP), homogeneity and separation.

Classification of network packets using hierarchical clustering (Hierarchical Clustering을 이용한 네트워크 패킷의 분류)

  • Yeo, Insung;Hai, Quan Tran;Hwang, Seong Oun
    • Journal of Internet of Things and Convergence
    • /
    • v.3 no.1
    • /
    • pp.9-11
    • /
    • 2017
  • Recently, with the widespread use of the Internet and mobile devices, the number of attacks by hackers using the network is increasing. When connecting a network, packets are exchanged and communicated, which includes various information. We analyze the information of these packets using hierarchical clustering analysis and classify normal and abnormal packets to detect attacks. With this analysis method, it will be possible to detect attacks by analyzing new packets.

Integrating physics-based fragility for hierarchical spectral clustering for resilience assessment of power distribution systems under extreme winds

  • Jintao Zhang;Wei Zhang;William Hughes;Amvrossios C. Bagtzoglou
    • Wind and Structures
    • /
    • v.39 no.1
    • /
    • pp.1-14
    • /
    • 2024
  • Widespread damages from extreme winds have attracted lots of attentions of the resilience assessment of power distribution systems. With many related environmental parameters as well as numerous power infrastructure components, such as poles and wires, the increased challenge of power asset management before, during and after extreme events have to be addressed to prevent possible cascading failures in the power distribution system. Many extreme winds from weather events, such as hurricanes, generate widespread damages in multiple areas such as the economy, social security, and infrastructure management. The livelihoods of residents in the impaired areas are devastated largely due to the paucity of vital utilities, such as electricity. To address the challenge of power grid asset management, power system clustering is needed to partition a complex power system into several stable clusters to prevent the cascading failure from happening. Traditionally, system clustering uses the Binary Decision Diagram (BDD) to derive the clustering result, which is time-consuming and inefficient. Meanwhile, the previous studies considering the weather hazards did not include any detailed weather-related meteorologic parameters which is not appropriate as the heterogeneity of the parameters could largely affect the system performance. Therefore, a fragility-based network hierarchical spectral clustering method is proposed. In the present paper, the fragility curve and surfaces for a power distribution subsystem are obtained first. The fragility of the subsystem under typical failure mechanisms is calculated as a function of wind speed and pole characteristic dimension (diameter or span length). Secondly, the proposed fragility-based hierarchical spectral clustering method (F-HSC) integrates the physics-based fragility analysis into Hierarchical Spectral Clustering (HSC) technique from graph theory to achieve the clustering result for the power distribution system under extreme weather events. From the results of vulnerability analysis, it could be seen that the system performance after clustering is better than before clustering. With the F-HSC method, the impact of the extreme weather events could be considered with topology to cluster different power distribution systems to prevent the system from experiencing power blackouts.

Results of Discriminant Analysis with Respect to Cluster Analyses Under Dimensional Reduction

  • Chae, Seong-San
    • Communications for Statistical Applications and Methods
    • /
    • v.9 no.2
    • /
    • pp.543-553
    • /
    • 2002
  • Principal component analysis is applied to reduce p-dimensions into q-dimensions ( $q {\leq} p$). Any partition of a collection of data points with p and q variables generated by the application of six hierarchical clustering methods is re-classified by discriminant analysis. From the application of discriminant analysis through each hierarchical clustering method, correct classification ratios are obtained. The results illustrate which method is more reasonable in exploratory data analysis.

On the clustering of huge categorical data

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society
    • /
    • v.21 no.6
    • /
    • pp.1353-1359
    • /
    • 2010
  • Basic objective in cluster analysis is to discover natural groupings of items. In general, clustering is conducted based on some similarity (or dissimilarity) matrix or the original input data. Various measures of similarities between objects are developed. In this paper, we consider a clustering of huge categorical real data set which shows the aspects of time-location-activity of Korean people. Some useful similarity measure for the data set, are developed and adopted for the categorical variables. Hierarchical and nonhierarchical clustering method are applied for the considered data set which is huge and consists of many categorical variables.

Two Phase Hierarchical Clustering Algorithm for Group Formation in Data Mining (데이터 마이닝에서 그룹 세분화를 위한 2단계 계층적 글러스터링 알고리듬)

  • 황인수
    • Korean Management Science Review
    • /
    • v.19 no.1
    • /
    • pp.189-196
    • /
    • 2002
  • Data clustering is often one of the first steps in data mining analysis. It Identifies groups of related objects that can be used as a starling point for exploring further relationships. This technique supports the development of population segmentation models, such as demographic-based customer segmentation. This paper Purpose to present the development of two phase hierarchical clustering algorithm for group formation. Applications of the algorithm for product-customer group formation in customer relationahip management are also discussed. As a result of computer simulations, suggested algorithm outperforms single link method and k-means clustering.

Exploration of Hierarchical Techniques for Clustering Korean Author Names (한글 저자명 군집화를 위한 계층적 기법 비교)

  • Kang, In-Su
    • Journal of Information Management
    • /
    • v.40 no.2
    • /
    • pp.95-115
    • /
    • 2009
  • Author resolution is to disambiguate same-name author occurrences into real individuals. For this, pair-wise author similarities are computed for author name entities, and then clustering is performed. So far, many studies have employed hierarchical clustering techniques for author disambiguation. However, various hierarchical clustering methods have not been sufficiently investigated. This study covers an empirical evaluation and analysis of hierarchical clustering applied to Korean author resolution, using multiple distance functions such as Dice coefficient, Cosine similarity, Euclidean distance, Jaccard coefficient, Pearson correlation coefficient.