• Title/Summary/Keyword: Optimal clustering


Semidefinite Spectral Clustering (준정부호 스펙트럼의 군집화)

  • Kim, Jae-Hwan;Choi, Seung-Jin
    • Proceedings of the Korean Information Science Society Conference / 2005.07a / pp.892-894 / 2005
  • Graph partitioning provides an important tool for data clustering, but it is an NP-hard combinatorial optimization problem. Spectral clustering, in which the clustering is performed by eigen-decomposition of an affinity matrix [1,2], is a popular way of solving the graph partitioning problem. Semidefinite relaxation, on the other hand, is an alternative way of relaxing combinatorial optimization problems, leading to a convex optimization problem [4]. In this paper we present a semidefinite programming (SDP) approach to graph equi-partitioning for clustering, and then use eigen-decomposition to obtain an optimal partition set. The method is therefore referred to as semidefinite spectral clustering (SSC). Numerical experiments with several artificial and real data sets demonstrate the useful behavior of our SSC compared to existing spectral clustering methods.


An Improved Automated Spectral Clustering Algorithm

  • Xiaodan Lv
    • Journal of Information Processing Systems / v.20 no.2 / pp.185-199 / 2024
  • In this paper, an improved automated spectral clustering (IASC) algorithm is proposed to address the limitations of the traditional spectral clustering (TSC) algorithm, particularly its inability to automatically determine the number of clusters. First, a cluster number evaluation factor based on the optimal clustering principle is proposed. By iterating through different k values, the value corresponding to the largest evaluation factor is selected as the first-rank number of clusters. Second, the IASC algorithm adopts a density-sensitive distance to measure the similarity between sample points, which gives high similarity to data distributed in the same high-density area. Third, to improve clustering accuracy, the IASC algorithm uses the cosine angle classification method instead of K-means to classify the eigenvectors. Six algorithms (K-means, fuzzy C-means, TSC, EIGENGAP, DBSCAN, and density peak) were compared with the proposed algorithm on six datasets. The results show that the IASC algorithm not only automatically determines the number of clusters but also obtains better clustering accuracy on both synthetic and UCI datasets.
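
The abstract does not define the IASC evaluation factor, so it is not reproduced here. As a point of comparison, the following is a minimal sketch of the eigengap heuristic that the abstract lists as a baseline (EIGENGAP): given graph Laplacian eigenvalues sorted in ascending order, it picks the k with the largest gap between the k-th and (k+1)-th eigenvalue. The function name `eigengap_k`, the `k_max` parameter, and the sample eigenvalues are illustrative assumptions, not taken from the paper.

```python
def eigengap_k(eigenvalues, k_max=None):
    """Return the k with the largest gap between the k-th and (k+1)-th smallest eigenvalue."""
    vals = sorted(eigenvalues)
    limit = len(vals) - 1 if k_max is None else min(k_max, len(vals) - 1)
    if limit < 1:
        raise ValueError("need at least two eigenvalues")
    # gaps[i] is the gap that follows the (i + 1)-th smallest eigenvalue.
    gaps = [vals[k] - vals[k - 1] for k in range(1, limit + 1)]
    return 1 + max(range(len(gaps)), key=gaps.__getitem__)


# Hypothetical Laplacian spectrum: three near-zero eigenvalues suggest three clusters.
laplacian_eigenvalues = [0.0, 0.01, 0.02, 0.9, 1.0, 1.1]
print(eigengap_k(laplacian_eigenvalues))  # 3
```

Like the procedure described in the abstract, this scans candidate k values and keeps the one with the best score; only the score itself differs.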

Global Optimization of Clusters in Gene Expression Data of DNA Microarrays by Deterministic Annealing

  • Lee, Kwon Moo;Chung, Tae Su;Kim, Ju Han
    • Genomics & Informatics / v.1 no.1 / pp.20-24 / 2003
  • The analysis of DNA microarray data is one of the most important tasks in functional genomics research. The matrix representation of microarray data and its successive 'optimal' incisional hyperplanes provide a useful platform for developing optimization algorithms that determine the optimal partitioning of a pairwise proximity matrix representing a completely connected, weighted graph. We developed a deterministic annealing (DA) approach to determine the successive optimal binary partitioning. The DA algorithm demonstrated good performance, with the ability to find the 'globally optimal' binary partitions. In addition, objects that have not been clustered at a small nonzero temperature are considered very sensitive to even small randomness, and can be used to estimate the reliability of the clustering.
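
To illustrate the general idea of deterministic annealing for a binary partition, here is a minimal sketch. It uses a generic centroid-based variant on 1-D points, not the pairwise-proximity formulation of the paper: assignments are soft at high temperature and harden as the temperature is lowered. The function name, the cooling schedule, and all parameter values are assumptions chosen for illustration.

```python
import math


def anneal_two_clusters(points, t_start=50.0, t_min=0.01, cooling=0.8, inner=50, eps=1e-3):
    """Split 1-D points into two clusters by deterministic annealing."""
    mean = sum(points) / len(points)
    centroids = [mean, mean]
    t = t_start
    while t > t_min:
        # Nudge the centroids apart so they can separate once t is low enough.
        centroids = [centroids[0] - eps, centroids[1] + eps]
        for _ in range(inner):
            sums, weights = [0.0, 0.0], [0.0, 0.0]
            for x in points:
                d = [(x - c) ** 2 for c in centroids]
                d_min = min(d)  # subtract the minimum for numerical stability
                p = [math.exp(-(dk - d_min) / t) for dk in d]
                z = p[0] + p[1]
                for k in range(2):
                    sums[k] += p[k] / z * x
                    weights[k] += p[k] / z
            # Soft-assignment-weighted means; keep the old centroid if a cluster is empty.
            centroids = [sums[k] / weights[k] if weights[k] > 0 else centroids[k]
                         for k in range(2)]
        t *= cooling  # lower the temperature
    labels = [0 if (x - centroids[0]) ** 2 <= (x - centroids[1]) ** 2 else 1
              for x in points]
    return centroids, labels


centroids, labels = anneal_two_clusters([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In this sketch, a point whose soft assignment is still close to 0.5 at a small nonzero temperature would correspond to the "not yet clustered" objects the abstract uses as a reliability signal.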

A Determination of an Optimal Clustering Method Based on Data Characteristics

  • Kim, Jeong-Hun;Yoo, Kwan-Hee;Nasridinov, Aziz
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.8 / pp.305-314 / 2017
  • Clustering is a method that collects data objects into groups based on their similarity. The performance of state-of-the-art clustering methods differs according to the data characteristics. Numerous studies have compared the accuracy of state-of-the-art clustering methods experimentally on various kinds of datasets. A common problem of these studies is that they consider only which clustering algorithms yield the most accurate results for a particular dataset. They do not consider which factors affect the execution time of each clustering method, or how. Nevertheless, execution time is an important factor in clustering performance when there is no significant difference in accuracy. To address these shortcomings of existing research, we compare the accuracy of four representative clustering methods through a series of experiments using various types of datasets. In addition, we perform practical clustering performance comparisons by deriving time complexity and identifying the factors that influence performance.

Improvement of Self Organizing Maps using Gap Statistic and Probability Distribution

  • Jun, Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems / v.8 no.2 / pp.116-120 / 2008
  • Clustering is a method for unsupervised learning. General clustering tools have depended on statistical methods and machine learning algorithms. One of the popular machine-learning-based clustering algorithms is the self-organizing map (SOM), a neural network model for clustering. SOM and extended SOM have been used in diverse classification and clustering fields such as data mining. However, SOM has a problem in determining the optimal number of clusters. In this paper, we propose an improvement of SOM using the gap statistic and probability distributions. The gap statistic was introduced to estimate the number of clusters in a dataset; we use it to address this problem of SOM. In addition, in our approach the weights of feature nodes are updated by probability distributions. After complete updating according to prior and posterior distributions, the weights of the SOM have probability distributions for optimal clustering. To verify the improved performance of our work, we conduct experiments comparing it with other learning algorithms on simulated data sets.
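
The gap statistic mentioned in this abstract compares the log within-cluster dispersion of the data with its expected value under a reference distribution with no cluster structure. The sketch below assumes 1-D data, a simple k-means in place of the SOM, and a uniform reference distribution over the data range; the function names and parameter defaults are illustrative, not taken from the paper.

```python
import math
import random


def within_dispersion(data, k, iters=50):
    """Within-cluster sum of squares W_k from a simple 1-D k-means."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic initialization: spread the centers over the sorted data.
    centers = [xs[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: (x - centers[j]) ** 2)
            groups[nearest].append(x)
        # Keep the old center when a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return sum(min((x - c) ** 2 for c in centers) for x in xs)


def gap_statistic(data, k_max=3, n_refs=20, seed=0):
    """Return Gap(k) for k = 1..k_max using uniform reference samples."""
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    gaps = []
    for k in range(1, k_max + 1):
        # The small floor avoids log(0) when the dispersion vanishes.
        log_w = math.log(max(within_dispersion(data, k), 1e-12))
        ref_logs = []
        for _ in range(n_refs):
            ref = [rng.uniform(lo, hi) for _ in data]
            ref_logs.append(math.log(max(within_dispersion(ref, k), 1e-12)))
        gaps.append(sum(ref_logs) / n_refs - log_w)
    return gaps


data = [0.0, 0.2, 0.4, 0.6, 0.8, 10.0, 10.2, 10.4, 10.6, 10.8]
for k, gap in enumerate(gap_statistic(data), start=1):
    print(k, round(gap, 3))
```

For two well-separated groups like the sample data, Gap(2) comes out clearly larger than Gap(1), which is the signal used to estimate the number of clusters.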

PI Controller Design for Permanent Magnet Synchronous Motor Drives Using Clustering Fuzzy Algorithm (콜러스터링 퍼지알고리즘을 이용한 영구자석 동기전동기 구동용 PI 제어기 설계)

  • Kwon, Chung-Jin;Han, Woo-Yong
    • Proceedings of the KIEE Conference / 2004.10a / pp.182-184 / 2004
  • This paper presents a PI controller tuning method for high-performance permanent magnet synchronous motor (PMSM) drives under load variations, using a clustering fuzzy algorithm. In many speed tracking control systems, the PI controller has been used because of its simple structure and ease of design. The PI controller, however, suffers from electrical machine parameter variations and disturbances. To improve tracking control performance under load variations, the PI controller parameters are modified during operation by the clustering fuzzy method. This method, based on an optimal fuzzy logic system, has a simple structure and is computationally simple. It needs only sample data, which are obtained off-line by an optimal controller. Because the sample data implemented in the adaptive fuzzy system can be modified or extended, a flexible control system can be obtained. Simulation results show the usefulness of the proposed controller.


A Study on the Development of Clustering Algorithm Using the Entropic Measure of Cohesion (앤트로피 응집력척도를 활용한 군락화기법개발에 관한 연구)

  • 정현태;최인수
    • Journal of the Korean Operations Research and Management Science Society / v.14 no.1 / pp.36-50 / 1989
  • The purpose of this study is to design effective working systems that adapt to changes in human needs by developing an algorithm that forms workers into optimal groups using the measure of cohesion. Three major results can be derived from the study. First, the algorithm developed here provides an optimal point at which to stop clustering. Second, the entropic measure of cohesion, which has an internal probabilistic structure, is superior to previously proposed methods as far as the design of workgroups is concerned. Third, the r $C_{n}$ clustering algorithm is better than the dichotomic one.


Optimal Base Station Clustering for a Mobile Communication Network Design

  • Hong, Jung-Man;Lee, Jong-Hyup;Lee, Soong-Hee
    • KSII Transactions on Internet and Information Systems (TIIS) / v.5 no.5 / pp.1069-1084 / 2011
  • This paper considers an optimal base station clustering problem for designing a mobile (wireless) communication network. For a given network with a set of nodes (base stations), the problem is to optimally partition the set of nodes into subsets (each called a cluster) such that the associated inter-cluster traffic is minimized under certain topological constraints and cluster capacity constraints. In the problem analysis, the problem is formulated as an integer programming problem. The integer programming problem is then transformed into a binary integer programming problem, for which the associated linear programming relaxation is solved in a column generation approach assisted by a branch-and-bound procedure. For the column generation, both a heuristic algorithm and a valid inequality approach are exploited. Various numerical examples are solved to evaluate the effectiveness of the LP (Linear Programming) based branch-and-bound algorithm.
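
The objective described in this abstract, the traffic between nodes placed in different clusters, can be evaluated for a candidate partition as in the sketch below. It shows only the objective and a simple feasibility check, not the column generation or branch-and-bound procedure. The symmetric traffic matrix, the function names, and the node-count form of the capacity limit are assumptions made for illustration; the abstract does not specify the constraints.

```python
def inter_cluster_traffic(traffic, assignment):
    """Sum the traffic over node pairs that are assigned to different clusters."""
    n = len(assignment)
    return sum(
        traffic[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if assignment[i] != assignment[j]
    )


def within_capacity(assignment, max_nodes):
    """Check an assumed capacity rule: no cluster holds more than max_nodes nodes."""
    sizes = {}
    for cluster in assignment:
        sizes[cluster] = sizes.get(cluster, 0) + 1
    return all(size <= max_nodes for size in sizes.values())


# Hypothetical symmetric traffic between four base stations.
traffic = [
    [0, 8, 1, 0],
    [8, 0, 2, 1],
    [1, 2, 0, 9],
    [0, 1, 9, 0],
]
print(inter_cluster_traffic(traffic, [0, 0, 1, 1]))  # 4
print(inter_cluster_traffic(traffic, [0, 1, 0, 1]))  # 19
print(within_capacity([0, 0, 1, 1], max_nodes=2))    # True
```

Grouping the heavily communicating pairs together (the first assignment) leaves far less traffic crossing cluster boundaries than the second assignment.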

Clustering Optimal Design in Wireless Sensor Network using Ant Colony Optimization (개미군 최적화 방법을 적용한 무선 센서 네트워크에서의 클러스터링 최적 설계)

  • Kim, Sung-Soo;Choi, Seung-Hyeon
    • Korean Management Science Review / v.26 no.3 / pp.55-65 / 2009
  • The objective of this paper is to propose an ant colony optimization (ACO) approach for clustering design in wireless sensor networks. The proposed ACO approach is designed to deal with the dynamics of the sensor nodes and can adapt over time to topological changes in any network graph. Long communication distances between the sensors and a sink in a sensor network can greatly consume the energy of the sensors and reduce the lifetime of the network. Using the proposed ACO, we can greatly reduce the total communication distance while minimizing the number of cluster heads. Simulation results show that the proposed method is very efficient at finding the best solutions, compared with the optimal solutions obtained using CPLEX, in 100-, 200-, and 400-node sensor networks.

Clustering Parts Based on the Design and Manufacturing Similarities Using a Genetic Algorithm

  • Lee, Sung-Youl
    • Journal of Korea Society of Industrial Information Systems / v.16 no.4 / pp.119-125 / 2011
  • Part family (PF) formation in cellular manufacturing has been a key issue for the successful implementation of Group Technology (GT). Basically, a part has two different kinds of attributes: design and manufacturing. The similarities in these two attributes often conflict with each other. However, both attributes should be taken into account appropriately in order for the PF to maximize the benefits of the GT implementation. This paper proposes a clustering algorithm that considers the two attributes simultaneously, based on Pareto optimality theory. The similarity in each attribute is represented as an individual objective function. The resulting two objective functions are then combined into a Pareto fitness function, which assigns a single fitness value to each solution. A genetic algorithm (GA) is used to find the Pareto optimal set of solutions based on the fitness function. A set of hypothetical parts is grouped using the proposed system. The results show that the proposed system is very promising for clustering with multiple objectives.
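
The notion of a Pareto optimal set used in this abstract can be illustrated with a plain non-dominated filter over two objectives that are both maximized. This sketch is not the paper's Pareto fitness function or its genetic algorithm; the function names and the (design similarity, manufacturing similarity) score pairs are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical (design similarity, manufacturing similarity) scores of candidate groupings.
candidates = [(0.9, 0.2), (0.7, 0.6), (0.4, 0.8), (0.5, 0.5), (0.3, 0.3)]
print(pareto_front(candidates))  # [(0.9, 0.2), (0.7, 0.6), (0.4, 0.8)]
```

The three surviving candidates each trade one similarity against the other, while (0.5, 0.5) and (0.3, 0.3) are dropped because (0.7, 0.6) is better on both objectives.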