• Title/Summary/Keyword: hierarchical clustering (계층적 클러스터링)

Ant Colony Hierarchical Cluster Analysis (개미 군락 시스템을 이용한 계층적 클러스터 분석)

  • Kang, Mun-Su;Choi, Young-Sik
    • Journal of Internet Computing and Services / v.15 no.5 / pp.95-105 / 2014
  • In this paper, we present a novel ant-based hierarchical clustering algorithm in which ants repeatedly hop from one node to another over a weighted directed k-nearest-neighbor graph built from a given dataset. We introduce the notion of node pheromone, defined as the sum of the pheromone on the arcs entering a node. The node pheromone can be regarded as a relative density measure in a local region. After the ants have made a finite number of hops, we remove nodes with little node pheromone from the directed graph and take the resulting strongly connected components as clusters. We repeat this removal step while raising the threshold from a low value to a high value, which yields a hierarchy of clusters. We evaluate the proposed algorithm on synthetic and real data sets and compare it with traditional clustering methods. In these experiments, the proposed method outperforms the traditional methods.
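The pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the hopping rule or the pheromone update, so the sketch assumes uniform random hops and one unit of pheromone per hop, and every function name and parameter (`knn_graph`, `node_pheromone`, `n_ants`, `n_hops`, `cluster_hierarchy`) is hypothetical.

```python
import random
from collections import defaultdict


def knn_graph(points, k):
    """Directed k-NN graph: arc i -> j for each of i's k nearest neighbours."""
    graph = {}
    for i, p in enumerate(points):
        by_distance = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points) if j != i
        )
        graph[i] = [j for _, j in by_distance[:k]]
    return graph


def node_pheromone(graph, n_ants=20, n_hops=200, seed=0):
    """Each hop deposits one unit on the arc taken, so a node's pheromone
    (the sum over its incoming arcs) equals the number of arrivals at it."""
    rng = random.Random(seed)
    pheromone = {u: 0.0 for u in graph}
    nodes = list(graph)
    for _ in range(n_ants):
        current = rng.choice(nodes)
        for _ in range(n_hops):
            current = rng.choice(graph[current])  # uniform hop (assumption)
            pheromone[current] += 1.0
    return pheromone


def strongly_connected_components(graph, keep):
    """Kosaraju's algorithm on the subgraph induced by the nodes in `keep`."""
    sub = {u: [v for v in graph[u] if v in keep] for u in keep}
    rev = defaultdict(list)
    for u, targets in sub.items():
        for v in targets:
            rev[v].append(u)
    order, seen = [], set()
    for start in sub:                      # pass 1: record DFS finishing order
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(sub[start]))]
        while stack:
            u, arcs = stack[-1]
            for v in arcs:
                if v not in seen:
                    seen.add(v)
                    stack.append((v, iter(sub[v])))
                    break
            else:
                order.append(u)
                stack.pop()
    components, assigned = [], set()
    for start in reversed(order):          # pass 2: DFS on the reversed graph
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = [], [start]
        while stack:
            u = stack.pop()
            component.append(u)
            for v in rev[u]:
                if v not in assigned:
                    assigned.add(v)
                    stack.append(v)
        components.append(sorted(component))
    return components


def cluster_hierarchy(points, k, thresholds):
    """One clustering per threshold, from the lowest threshold to the highest."""
    graph = knn_graph(points, k)
    pheromone = node_pheromone(graph)
    levels = []
    for t in sorted(thresholds):
        keep = {u for u in graph if pheromone[u] >= t}
        levels.append((t, strongly_connected_components(graph, keep)))
    return levels
```

Raising the threshold removes the sparsest nodes first, so components found at a higher threshold are nested inside those found at a lower one; that nesting is what makes the output a hierarchy.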

Property-based Hierarchical Clustering of Peers using Mobile Agent for Unstructured P2P Systems (비구조화 P2P 시스템에서 이동에이전트를 이용한 Peer의 속성기반 계층적 클러스터링)

  • Salvo, Michael Angel G.;Mateo, Romeo Mark A.;Lee, Jae-Wan
    • Journal of Internet Computing and Services / v.10 no.4 / pp.189-198 / 2009
  • Unstructured peer-to-peer (P2P) systems are the most commonly used on today's Internet, but file placement in these systems is random and there is no correlation between peers and their contents, so there is no guarantee that flooding queries will find the desired data. In this paper, we propose clustering the nodes of unstructured P2P systems with the agglomerative hierarchical clustering algorithm to improve search. We compared the delay time of clustering the nodes with our proposed algorithm against that of the k-means clustering algorithm. We also simulated the delay time of locating data in a network topology and recorded the system overhead with our proposed algorithm, with k-means clustering, and without clustering. Simulation results show that our proposed algorithm has a shorter delay time than the other methods and also reduces resource overhead.
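As a rough illustration of agglomerative hierarchical clustering applied to peers, the sketch below merges peers bottom-up by the similarity of their property vectors. The abstract does not say which properties or which linkage the authors use, so the two-component property vectors, the Euclidean distance, the average linkage, and the name `agglomerate` are all assumptions made for this example.

```python
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two property vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerate(peers, n_clusters):
    """Start with one cluster per peer and repeatedly merge the two clusters
    with the smallest average pairwise distance (average linkage)."""
    clusters = [[name] for name in peers]

    def average_distance(pair):
        a, b = pair
        total = sum(
            euclidean(peers[p], peers[q])
            for p in clusters[a] for q in clusters[b]
        )
        return total / (len(clusters[a]) * len(clusters[b]))

    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2), key=average_distance)
        merged = clusters.pop(b)       # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters


# Hypothetical peers described by two made-up numeric properties.
peers = {
    "p1": (1.0, 1.0),
    "p2": (1.1, 0.9),
    "p3": (8.0, 8.0),
    "p4": (8.2, 7.9),
}
groups = agglomerate(peers, n_clusters=2)
```

A query could then be routed to the cluster whose members' properties best match it instead of being flooded to every peer, which is the search improvement the abstract is aiming at.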

Double Clustering of Gene Expression Data Based on the Information Bottleneck Method (정보병목기법에 기반한 유전자 발현 데이터의 이중 클러스터링)

  • 김병희;황규백;장정호;장병탁
    • Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.362-364 / 2003
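  • In functional genomics, clustering is one of the main tools for analyzing high-dimensional microarray data. In this paper, we propose an agglomerative hierarchical clustering method for gene expression data based on double clustering with the information bottleneck method. Given the joint probability distribution of two random variables, the information bottleneck method compresses one variable while preserving as much of the mutual information between the two variables as possible; double clustering compresses the two variables in turn. In experiments on NC160 data, a real microarray data set (gene expression data from cancer cells), we first clustered the genes by their expression patterns, then used the gene clusters to cluster the samples, and analyzed the performance from several angles. Examining performance in terms of mutual information, the numbers of gene and sample clusters, and an entropy measure, we were able to verify the assumption that samples can be distinguished by the tissue they were taken from, and to establish a criterion for the critical point that determines an appropriate number of clusters.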
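The idea of compressing one variable while preserving mutual information can be illustrated with a single agglomerative merge step. The sketch below is a generic illustration under stated assumptions, not the procedure from the paper: it takes a small joint distribution p(x, y) as a list of rows, and the names `mutual_information`, `merge_rows`, and `best_merge` are hypothetical.

```python
import math


def mutual_information(joint):
    """I(X;Y) in bits for a joint distribution given as rows of p(x, y)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log2(p / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )


def merge_rows(joint, a, b):
    """Compress X by merging values a and b into one cluster.
    The merged row is the element-wise sum of the two original rows."""
    merged = [x + y for x, y in zip(joint[a], joint[b])]
    rest = [row for i, row in enumerate(joint) if i not in (a, b)]
    return rest + [merged]


def best_merge(joint):
    """Return the pair of rows whose merge loses the least mutual information."""
    base = mutual_information(joint)
    pairs = [(a, b) for a in range(len(joint)) for b in range(a + 1, len(joint))]
    return min(
        pairs,
        key=lambda ab: base - mutual_information(merge_rows(joint, *ab)),
    )


# Rows 0 and 1 have the same conditional distribution over Y,
# so merging them should cost no information.
joint = [
    [0.25, 0.00],
    [0.25, 0.00],
    [0.00, 0.50],
]
```

Repeating the best merge until one cluster remains gives an agglomerative hierarchy over the rows; applying the same step to the columns of the compressed table corresponds to compressing the second variable in turn.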

Comparison of Initial Seeds Methods for K-Means Clustering (K-Means 클러스터링에서 초기 중심 선정 방법 비교)

  • Lee, Shinwon
    • Journal of Internet Computing and Services / v.13 no.6 / pp.1-8 / 2012
  • Clustering methods are divided into hierarchical clustering, partitional clustering, and others. The K-Means algorithm is a partitional clustering method well suited to clustering large numbers of documents quickly and easily. Its disadvantage is that random initial centers lead to different results from run to run, so a better choice is to place the initial centers as far from each other as possible. We propose a new method of selecting initial centers for K-Means clustering that uses the height of a triangle. With this method the centers are distributed evenly, and the result is more accurate than with randomly selected initial centers. Selecting the centers takes extra time, but it reduces total clustering time by minimizing the number of allocation and recalculation steps. Compared with the standard algorithm, the average time consumed is reduced by 38.4%.
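To see why the choice of initial centers affects the number of allocation and recalculation rounds, the sketch below runs standard K-Means (Lloyd's algorithm) from caller-supplied centers and reports how many assignment rounds it needed. The triangle-height selection rule is not specified in the abstract, so it is not reproduced here; the function name `kmeans` and the toy data are assumptions for illustration only.

```python
def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm from caller-supplied initial centers.
    Returns (final centers, labels, number of assignment rounds)."""
    centers = [tuple(c) for c in centers]
    labels = None
    for rounds in range(1, max_iter + 1):
        # Allocation step: assign each point to its nearest center.
        new_labels = [
            min(
                range(len(centers)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        if new_labels == labels:           # assignments stable: converged
            return centers, labels, rounds
        labels = new_labels
        # Recalculation step: move each center to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:                    # an empty cluster keeps its center
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels, max_iter


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
_, labels_far, rounds_far = kmeans(points, [(0, 0), (10, 10)])    # spread-out seeds
_, labels_close, rounds_close = kmeans(points, [(0, 0), (0, 1)])  # adjacent seeds
```

On this toy data both runs reach the same partition, but the run that starts from well-separated centers converges in fewer rounds, which is the effect the abstract attributes to better seeding.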

A Study on the Asia Container Ports Clustering Using Hierarchical Clustering(Single, Complete, Average, Centroid Linkages) Methods with Empirical Verification of Clustering Using the Silhouette Method and the Second Stage(Type II) Cross-Efficiency Matrix Clustering Model (계층적 군집분석(최단, 최장, 평균, 중앙연결)방법에 의한 아시아 컨테이너 항만의 클러스터링 측정 및 실루엣방법과 2단계(Type II) 교차효율성 메트릭스 군집모형을 이용한 실증적 검증에 관한 연구)

  • Park, Ro-Kyung
    • Journal of Korea Port Economic Association / v.37 no.1 / pp.31-70 / 2021
  • The purpose of this paper is to measure changes in clustering, analyze the empirical results, and choose clustering partners for Busan, Incheon, and Gwangyang ports by applying hierarchical clustering (single, complete, average, and centroid linkage), Silhouette, and 2SCE [second-stage (Type II) cross-efficiency] matrix clustering models to Asian container ports over the period 2009-2018. The models use number of cranes, depth, berth length, and total area as inputs and container throughput in TEU as output. The main empirical results are as follows. First, ranked by the efficiency increase ratio over the 10-year analysis period, the models are Silhouette (up 0.4052), hierarchical clustering (up 0.3097), and 2SCE (up 0.1057). Second, according to empirical verification with the Silhouette and 2SCE models, Busan Port should be clustered with Dubai, Hong Kong, and Tanjung Priok, while Incheon Port and Gwangyang Port should be clustered with most ports. Third, within ASEAN, suitable clustering partners are Singapore for Busan Port; Tanjung Priok, Tanjung Perak, Manila, Tanjung Pelepas, Laem Chabang, and Bangkok for Incheon Port; and Tanjung Priok, Tanjung Perak, Port Klang, Tanjung Pelepas, Laem Chabang, and Bangkok for Gwangyang Port. Fourth, Wilcoxon's signed-rank test of the models gives P values averaging 0.852, meaning that the average efficiency figures and ranking orders of the models are consistent with each other. The policy implication is that port policy makers and port operation managers should select benchmarking ports by applying the models used in this study to port clustering, compare and analyze the development and operation plans of those ports, and promptly introduce and implement the parts that require benchmarking.
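The Silhouette method mentioned in this abstract scores how well each item fits its assigned cluster. The sketch below computes the standard mean silhouette width on made-up one-dimensional data; it does not use the paper's port data or its efficiency model, and the function name `silhouette_score` and the sample values are assumptions for illustration.

```python
def silhouette_score(points, labels):
    """Mean silhouette width over all points; assumes at least two clusters.
    For each point, a = mean distance to its own cluster's other members and
    b = smallest mean distance to any other cluster; width = (b - a) / max(a, b)."""
    def dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5

    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    widths = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:                        # singleton cluster: width is 0
            widths.append(0.0)
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)


# Made-up feature values: two tight groups far apart.
values = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_score(values, [0, 0, 1, 1])   # groups match the data
bad = silhouette_score(values, [0, 1, 0, 1])    # groups cut across the data
```

Scores near 1 indicate compact, well-separated clusters and negative scores indicate items that sit closer to another cluster, so the score can be used to compare the partitions produced by different linkage methods.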

An Efficient Clustering Scheme Considering Distance from a SINK for Wireless Sensor Networks (무선 센서 네트워크에서 싱크와의 거리를 고려한 효율적인 클러스터링 기법)

  • Kang, Tae-Wook;Jung, Il-Gyu;Han, Ki-Jun
    • Proceedings of the Korean Information Science Society Conference / 2005.11a / pp.445-447 / 2005
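  • A wireless sensor network consists of small sensor nodes with limited energy. Once deployed, sensor nodes are difficult to maintain or to resupply with energy. How efficiently each node uses its limited energy therefore has a large effect on the lifetime of the network. In this paper, we describe clustering-based hierarchical routing protocols developed to improve energy efficiency, such as LEACH (Low Energy Adaptive Clustering Hierarchy), LEACH-C (LEACH-Centralized), and BCDCP (Base-station Controlled Dynamic Clustering Protocol), and examine their problems. To address these problems, we propose a new clustering scheme that takes into account the distance between the nodes in the sensor field and the sink. In the proposed scheme, each node performs fewer tasks during cluster formation than in existing schemes and can thereby conserve its energy.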

Comparison of Software Clustering using Split Based Tree Analysis (분기점 기반 트리 분석을 통한 소프트웨어 클러스터링 결과 비교)

  • Um, Jaechul;Lee, Chan-gun
    • Journal of Software Engineering Society / v.25 no.3 / pp.59-62 / 2012
  • We propose a novel metric for quantitatively comparing the clustering results produced by different software clustering algorithms. A quantitative evaluation of software clustering helps in understanding architectural changes in software. The concept of a split, which has been used to analyze genetic characters in bioinformatics, is applied to the analysis of software architecture.

Mongrel : Global Placement with Hierarchical Partitioning (Mongrel : 계층적 분할 기법을 이용한 광역 배치)

  • 성영태;허성우
    • Proceedings of the Korean Information Science Society Conference / 2004.10a / pp.742-744 / 2004
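  • In this paper, we review the various techniques used to improve the performance of the standard-cell placer Mongrel and propose hierarchical global placement based on top-down hierarchical partitioning. Together with RBLS (Relaxation Based Local Search), hierarchical global placement plays a decisive role in improving Mongrel's performance; it uses hMETIS (a multilevel partitioning technique based on clustering) as its partitioner. Through experiments on standard benchmark circuits, we show that hierarchical global placement yields stable and efficient placement results.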

Refining Initial Seeds using Max Average Distance for K-Means Clustering (K-Means 클러스터링 성능 향상을 위한 최대평균거리 기반 초기값 설정)

  • Lee, Shin-Won;Lee, Won-Hee
    • Journal of Internet Computing and Services / v.12 no.2 / pp.103-111 / 2011
  • Clustering methods are divided into hierarchical clustering, partitional clustering, and others. When the number of documents is huge, hierarchical clustering takes too long. In this paper, we deal with the K-Means algorithm, a partitional clustering method well suited to clustering large numbers of documents quickly and easily. We propose a new method of selecting initial seeds for the K-Means algorithm, in which the seeds are chosen to lie as far from each other as possible.
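One way to pick seeds that lie far from each other, suggested by the "max average distance" in the title, is a greedy rule: each new seed is the point with the largest average distance to the seeds chosen so far. The abstract does not spell out the procedure, so the sketch below is an assumption, including the rule for the first seed; the name `max_avg_distance_seeds` is hypothetical.

```python
def max_avg_distance_seeds(points, k):
    """Greedily choose k seeds that are far from each other."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    # Assumption: the first seed is the point with the largest total
    # distance to all points (the most outlying one).
    seeds = [max(points, key=lambda p: sum(dist(p, q) for q in points))]
    while len(seeds) < k:
        candidates = [p for p in points if p not in seeds]
        # Next seed: the candidate with the largest average distance
        # to the seeds selected so far.
        seeds.append(
            max(candidates, key=lambda p: sum(dist(p, s) for s in seeds) / len(seeds))
        )
    return seeds


# Three loose groups of points; one seed should land in each group.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
seeds = max_avg_distance_seeds(points, k=3)
```

The returned seeds can be passed to any K-Means routine as its initial centers in place of randomly drawn points.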

Hierarchical Clustering of Symbolic Objects based on Asymmetric Proximity (비대칭적 유사도 기반의 심볼릭 객체의 계층적 클러스터링)

  • Oh, Seung-Joon;Park, Chan-Woong
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.6 / pp.729-734 / 2012
  • Cluster analysis has been widely used in numerous applications such as pattern recognition, data analysis, intrusion detection, image processing, and bioinformatics. Much previous work has been based on numeric data only. However, symbolic data analysis has emerged to deal with variables whose values can be intervals, histograms, and even functions. In this paper, we propose a clustering approach for symbolic objects based on nonsymmetric proximity. We explore a method for clustering symbolic patterns based on the average similarity value (ASV). The results of the proposed clustering method differ from those of existing methods and are very encouraging.