• Title/Abstract/Keywords: Clustering Method

A Two-Stage Method for Near-Optimal Clustering

  • 윤복식
    • 한국경영과학회지 / Vol. 29, No. 1 / pp. 43-56 / 2004
  • The purpose of clustering is to partition a set of objects into several clusters based on an appropriate similarity measure. In most cases, clustering is performed without any prior information on the number of clusters or the structure of the given data, which makes it a very complicated combinatorial optimization problem. In this paper, we propose a general-purpose clustering method that can determine the proper number of clusters and efficiently carry out clustering analysis for various types of data. The method is composed of two stages. In the first stage, two different hierarchical clustering methods are used to obtain a reasonably good clustering result, which is improved in the second stage by an accelerated simulated annealing (ASA) algorithm equipped with specially designed perturbation schemes. Extensive experimental results are given to demonstrate the usefulness of our ASA clustering method.
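
The second-stage idea in this abstract (refining an initial partition by simulated annealing) can be illustrated with a minimal sketch. This is not the paper's ASA algorithm: the cost function (within-cluster sum of squares on 1-D data), the single-point move used as the perturbation, the geometric cooling schedule, the fixed number of clusters, and all function names are assumptions chosen for illustration. The first-stage result is simply given as a list of labels.

```python
import math
import random


def within_cluster_cost(points, labels):
    """Sum of squared distances from each 1-D point to its cluster mean."""
    groups = {}
    for x, c in zip(points, labels):
        groups.setdefault(c, []).append(x)
    cost = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        cost += sum((x - mean) ** 2 for x in members)
    return cost


def anneal(points, labels, n_clusters, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Refine an initial partition by simulated annealing.

    The perturbation (move one random point to another cluster) and the
    geometric cooling schedule are illustrative assumptions.
    """
    rng = random.Random(seed)
    current = list(labels)
    current_cost = within_cluster_cost(points, current)
    best, best_cost = list(current), current_cost
    temp = t0
    for _ in range(steps):
        i = rng.randrange(len(points))
        new_label = rng.randrange(n_clusters)
        if new_label == current[i]:
            continue  # not a real move; try again
        old_label = current[i]
        current[i] = new_label
        new_cost = within_cluster_cost(points, current)
        delta = new_cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_cost = new_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        else:
            current[i] = old_label  # reject: undo the move
        temp *= cooling
    return best, best_cost


# Hypothetical data: a deliberately imperfect "stage 1" partition.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
initial = [0, 0, 1, 1, 1, 0]
best, cost = anneal(points, initial, n_clusters=2)
print(best, cost)
```

Because the best partition seen so far is tracked separately from the current one, the returned cost can never exceed the cost of the initial partition, even though individual moves may temporarily worsen it.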

Development of the Combinatorial Agglomerative Hierarchical Clustering Method Using the Measure of Cohesion

  • 정현태;최인수
    • 품질경영학회지 / Vol. 18, No. 1 / pp. 48-54 / 1990
  • The purpose of this study is to design effective working systems that adapt to changing human needs by developing a method that forms optimal groups using a measure of cohesion. Two main results follow from the study. First, the clustering method based on the entropic measure of cohesion outperforms the other methods proposed for designing work groups. Because this clustering criterion includes the symmetrical relations of the total work group and the dissimilarity as well as the similarity relations of predicate values, the method is suitable for designing the new work structure. Second, the total work group is first grouped into workers who have equal predicate values, and the clustering results are then produced through the combinatorial agglomerative hierarchical clustering method. This approach yields more economical results than clustering the total work group directly.

An Agglomerative Hierarchical Variable-Clustering Method Based on a Correlation Matrix

  • Lee, Kwangjin
    • Communications for Statistical Applications and Methods / Vol. 10, No. 2 / pp. 387-397 / 2003
  • Most studies that need a variable-clustering process use an exploratory factor analysis technique or a divisive hierarchical variable-clustering method based on a correlation matrix. Some researchers apply an object-clustering method to a distance matrix transformed from a correlation matrix, although this approach is known to be improper. In this paper, an agglomerative hierarchical variable-clustering method based on the correlation matrix itself is suggested. It is derived from a geometric concept using variate-spaces and a characterizing variate.

The Difference Order Clustering for Multi-dimensional Entities

  • 이철;강석호
    • 한국경영과학회지 / Vol. 14, No. 1 / pp. 108-118 / 1989
  • The clustering problem for multi-dimensional entities is investigated. A heuristic method named Difference Order Clustering (DOC) is developed for grouping multi-dimensional entities. The DOC method has the advantage of identifying bottleneck entities. The proposed DOC method, the modified rank order clustering (MODROC) method, and lexicographical rank order clustering using a minimum spanning tree (lexico-MMSTROC) are compared on a part type selection problem.

Medoid Determination in Deterministic Annealing-based Pairwise Clustering

  • Lee, Kyung-Mi;Lee, Keon-Myung
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 11, No. 3 / pp. 178-183 / 2011
  • The deterministic annealing-based clustering algorithm is an EM-based algorithm that behaves like simulated annealing yet is less sensitive to the initialization of parameters. Pairwise clustering is a clustering technique that uses only inter-entity distance information and does not require detailed attribute information. The pairwise deterministic annealing-based clustering algorithm repeatedly alternates between estimating the mean fields and updating the membership degrees of data objects to clusters until a termination condition holds. Lacking attribute value information, pairwise clustering algorithms do not explicitly determine the centroids or medoids of clusters, either during the clustering process or at its end. This paper proposes a method to identify the medoids as the centers of the formed clusters for the pairwise deterministic annealing-based clustering algorithm. Experimental results show that the proposed method locates meaningful medoids.
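
As a point of reference for what "identifying medoids from pairwise distances" means, here is a minimal sketch of the textbook medoid definition. It is not the method proposed in the paper, which works with the membership degrees produced by deterministic annealing. The sketch assumes hard cluster labels, and the function name `find_medoids` and the toy data are hypothetical.

```python
def find_medoids(dist, labels):
    """Return {cluster label: index of its medoid}.

    dist   -- symmetric matrix (list of lists) of pairwise distances
    labels -- hard cluster label for each object (an assumption of this sketch)
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    medoids = {}
    for lab, members in clusters.items():
        # The medoid is the member with the smallest total distance to the others.
        medoids[lab] = min(members, key=lambda i: sum(dist[i][j] for j in members))
    return medoids


# Hypothetical objects on a line; only their pairwise distances are used.
positions = [0, 1, 2, 10, 11, 13]
dist = [[abs(a - b) for b in positions] for a in positions]
labels = [0, 0, 0, 1, 1, 1]
print(find_medoids(dist, labels))  # {0: 1, 1: 4}
```

Only the distance matrix is consulted, so the same function applies when no attribute vectors exist at all, which is the setting the abstract describes.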

Spectral clustering based on the local similarity measure of shared neighbors

  • Cao, Zongqi;Chen, Hongjia;Wang, Xiang
    • ETRI Journal / Vol. 44, No. 5 / pp. 769-779 / 2022
  • Spectral clustering has become a typical and efficient clustering method used in a variety of applications. The critical step of spectral clustering is the similarity measurement, which largely determines the performance of the spectral clustering method. In this paper, we propose a novel spectral clustering algorithm based on the local similarity measure of shared neighbors. This similarity measurement exploits the local density information between data points based on the weight of the shared neighbors in a directed k-nearest neighbor graph with only one parameter k, that is, the number of nearest neighbors. Numerical experiments on synthetic and real-world datasets demonstrate that our proposed algorithm outperforms other existing spectral clustering algorithms in terms of the clustering performance measured via the normalized mutual information, clustering accuracy, and F-measure. As an example, the proposed method can provide an improvement of 15.82% in the clustering performance for the Soybean dataset.
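
The shared-neighbor idea in this abstract can be sketched in a simplified form. The version below counts how many of their k nearest neighbors two points have in common, using directed k-nearest-neighbor lists. The paper's measure additionally weights the shared neighbors, and that weighting is not reproduced here. The function names and the toy data are assumptions.

```python
def knn_lists(points, k):
    """Directed k-nearest-neighbor sets using squared Euclidean distance."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: sum((a - b) ** 2 for a, b in zip(p, points[j])))
        neighbors.append(set(others[:k]))
    return neighbors


def shared_neighbor_similarity(points, k):
    """Similarity matrix where entry (i, j) counts the k-NN that i and j share."""
    nn = knn_lists(points, k)
    n = len(points)
    sim = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim[i][j] = sim[j][i] = len(nn[i] & nn[j])
    return sim


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sim = shared_neighbor_similarity(points, k=2)
print(sim[0][1], sim[0][3])  # 1 0
```

A matrix like `sim` would take the place of the usual Gaussian-kernel affinity matrix in a spectral clustering pipeline. As in the abstract, the only parameter is the neighborhood size k.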

Magnetoencephalography Interictal Spike Clustering in Relation with Surgical Outcome of Cortical Dysplasia

  • Jeong, Woorim;Chung, Chun Kee;Kim, June Sic
    • Journal of Korean Neurosurgical Society / Vol. 52, No. 5 / pp. 466-471 / 2012
  • Objective: The aim of this study was to devise an objective clustering method for magnetoencephalography (MEG) interictal spike sources, and to identify the prognostic value of the new clustering method in adult epilepsy patients with cortical dysplasia (CD). Methods: We retrospectively analyzed 25 adult patients with histologically proven CD who underwent MEG examination and surgical resection for intractable epilepsy. The mean postoperative follow-up period was 3.1 years. A hierarchical clustering method was adopted for MEG interictal spike source clustering. Clustered sources were then tested for their prognostic value toward surgical outcome. Results: Postoperative seizure outcome was Engel class I in 6 (24%), class II in 3 (12%), class III in 12 (48%), and class IV in 4 (16%) patients. With respect to MEG spike clustering, 12 of 25 (48%) patients showed 1 cluster, 2 (8%) showed 2 or more clusters within the same lobe, 10 (40%) showed 2 or more clusters in different lobes, and 1 (4%) patient had only scattered spikes with no clustering. Patients who showed focal clustering achieved better surgical outcome than distributed cases (p=0.017). Conclusion: This is the first study that introduces an objective method to classify the distribution of MEG interictal spike sources. By using a hierarchical clustering method, we found that the presence of focal clustered spikes predicts a better postoperative outcome in epilepsy patients with CD.

The Study on Improvement of Cohesion of Clustering in Incremental Concept Learning

  • 백혜정;박영택
    • 정보처리학회논문지B / Vol. 10B, No. 3 / pp. 297-304 / 2003
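  • Since the advent of the Internet, there has been growing demand for systems that make efficient use of the explosively increasing volume of web information. Systems developed to meet this demand use clustering techniques to improve the quality of the information they serve. Clustering defines the interrelationships among unorganized data and uses them to group the data more systematically. Because a clustering-based system presents similar content together, users can grasp information more efficiently. In previous work, we therefore proposed an integrated clustering approach for efficiently clustering large volumes of data. This approach creates initial clusters with the COBWEB algorithm and then produces the clustering with the Etzioni algorithm. To improve the accuracy and efficiency of this existing integrated approach, this paper proposes two methods. First, we propose a clustering method that takes into account the weights of the attributes of the data to be clustered. Second, to support the efficiency of the existing clustering approach, we redefine the evaluation function that creates the initial clusters. The proposed clustering method can process vast amounts of data efficiently and reduces the dependence on the input order of the data, so that the data are clustered effectively, which helps build high-quality user profiles.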

Double monothetic clustering for histogram-valued data

  • Kim, Jaejik;Billard, L.
    • Communications for Statistical Applications and Methods / Vol. 25, No. 3 / pp. 263-274 / 2018
  • One of the common issues in large dataset analyses is to detect and construct homogeneous groups of objects in those datasets. This is typically done by some form of clustering technique. In this study, we present a divisive hierarchical clustering method for two monothetic characteristics of histogram data. Unlike classical data points, a histogram has its own internal variation as well as location information. However, to find the optimal bipartition, existing divisive monothetic clustering methods for histogram data consider only location information as a monothetic characteristic, so they cannot distinguish histograms with the same location but different internal variations. Thus, a divisive clustering method considering both the location and the internal variation of histograms is proposed in this study. The method has an advantage in interpreting clustering outcomes because it provides a binary question for each split. The proposed clustering method is verified through a simulation study and applied to a large U.S. house property value dataset.

Classification Tree-Based Feature-Selective Clustering Analysis: Case of Credit Card Customer Segmentation

  • 윤한성
    • 디지털산업정보학회논문지 / Vol. 19, No. 4 / pp. 1-11 / 2023
  • Clustering analysis is used in various fields, including customer segmentation, and clustering methods such as k-means are actively applied to credit card customer segmentation. In this paper, we summarize a method for selecting the input features of k-means clustering for the credit card customer segmentation problem and evaluate its feasibility through the analysis results. Using the label values of the k-means clustering results as the target feature of a decision tree classifier, we compose a method for prioritizing input features by the information gain of the branches. It is not easy to judge effectiveness with a clustering validity index, but in terms of the CH index, cluster effectiveness clearly improves with the method presented in this paper compared with randomly determined priorities. The suggested method can be used to improve the effectiveness of widely used clustering analyses, including the k-means method.
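
The prioritization step described in this abstract (treating cluster labels as a classification target and ranking input features by information gain) can be sketched as follows. This is a simplified stand-in, not the paper's procedure. Instead of growing a full decision tree, it scores each feature by the best single-threshold split. The cluster labels are hard-coded as if they came from a prior k-means run, and the feature names `spend` and `age` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest information gain over all single-threshold splits of one feature."""
    base = entropy(labels)
    best = 0.0
    # Excluding the largest value keeps both sides of every split non-empty.
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        best = max(best, base - weighted)
    return best


def rank_features(rows, labels, names):
    """Order feature names by information gain against the cluster labels."""
    gains = {
        name: best_split_gain([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(gains, key=gains.get, reverse=True), gains


# Hypothetical customers (spend, age) with labels assumed to come from k-means.
rows = [(1, 7), (2, 3), (8, 6), (9, 2)]
cluster_labels = [0, 0, 1, 1]
order, gains = rank_features(rows, cluster_labels, ["spend", "age"])
print(order)  # ['spend', 'age']
```

Features at the front of `order` separate the clusters most cleanly and would be the first candidates to keep when re-running the clustering with a reduced input set.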