• Title/Summary/Keyword: Clustering Problem

Improvement of Self Organizing Maps using Gap Statistic and Probability Distribution

  • Jun, Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems / v.8 no.2 / pp.116-120 / 2008
  • Clustering is a method for unsupervised learning. General clustering tools have depended on statistical methods and machine learning algorithms. One popular clustering algorithm based on machine learning is the self-organizing map (SOM), a neural network model for clustering. SOM and its extensions have been used in diverse classification and clustering fields such as data mining. However, SOM has difficulty determining the optimal number of clusters. In this paper, we propose an improvement of SOM using the gap statistic and probability distributions. The gap statistic was introduced to estimate the number of clusters in a dataset, and we use it to address this problem of SOM. In addition, the weights of feature nodes are updated by probability distributions. Once updating according to the prior and posterior distributions is complete, the weights of the SOM have probability distributions for optimal clustering. To verify the improved performance of our work, we run experiments comparing it with other learning algorithms on simulated data sets.
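
For reference, the gap statistic mentioned in this abstract is usually written as below. This is the commonly cited general formulation, not notation taken from the paper itself; the symbols $W_k$, $D_r$, $n_r$, $B$, and $s_k$ are introduced here only for illustration.

```latex
% Pooled within-cluster dispersion for a partition into clusters C_1, ..., C_k
D_r = \sum_{i, i' \in C_r} d(x_i, x_{i'}), \qquad
W_k = \sum_{r=1}^{k} \frac{1}{2 n_r} D_r

% Gap between the expected log-dispersion under a reference (null)
% distribution and the observed log-dispersion
\mathrm{Gap}_n(k) = \mathbb{E}^{*}_{n}\left[\log W_k\right] - \log W_k

% Usual selection rule, with s_k = \mathrm{sd}_k \sqrt{1 + 1/B}
% estimated from B reference data sets
\hat{k} = \min \left\{ k : \mathrm{Gap}_n(k) \ge \mathrm{Gap}_n(k+1) - s_{k+1} \right\}
```

Here $n_r$ is the size of cluster $C_r$ and $d$ is the chosen dissimilarity. How the paper combines this estimate with the SOM weight updates is not stated in the abstract.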

Hyper-ellipsoidal clustering algorithm using Linear Matrix Inequality (선형행렬 부등식을 이용한 타원형 클러스터링 알고리즘)

  • Lee, Han-Sung;Park, Joo-Young;Park, Dai-Hee
    • Journal of the Korean Institute of Intelligent Systems / v.12 no.4 / pp.300-305 / 2002
  • In this paper, we use a modified Gaussian kernel function as the clustering distance measure and recast the given hyper-ellipsoidal clustering problem as an optimization problem that minimizes the volume of the hyper-ellipsoidal clusters. We solve this problem using the EVP (eigenvalue problem), which is one of the LMI (linear matrix inequality) techniques.

Application of Genetic and Local Optimization Algorithms for Object Clustering Problem with Similarity Coefficients (유사성 계수를 이용한 군집화 문제에서 유전자와 국부 최적화 알고리듬의 적용)

  • Yim, Dong-Soon;Oh, Hyun-Seung
    • Journal of Korean Institute of Industrial Engineers / v.29 no.1 / pp.90-99 / 2003
  • Object clustering classifies a set of objects into groups such that objects within a group have similar characteristics and objects in different groups have dissimilar characteristics. It has been exploited in diverse areas such as information retrieval, data mining, and group technology. In this study, an object clustering problem with similarity coefficients between objects is considered. First, an evaluation function for the optimization problem is defined. Then, a genetic algorithm and a heuristic local optimization technique are proposed and used to obtain near-optimal solutions. Solutions from the genetic algorithm are improved by local optimization techniques based on object relocation and cluster merging. The validity and effectiveness of the proposed algorithms are tested through extensive experiments.

Empirical Comparisons of Clustering Algorithms using Silhouette Information

  • Jun, Sung-Hae;Lee, Seung-Joo
    • International Journal of Fuzzy Logic and Intelligent Systems / v.10 no.1 / pp.31-36 / 2010
  • Many clustering algorithms have been used in diverse fields. When a given data set must be grouped into clusters, many clustering algorithms based on similarity or distance measures can be considered. Most clustering work has been based on hierarchical and non-hierarchical clustering algorithms. In practice, researchers have chosen among these algorithms case by case and have had to determine the proper clustering method subjectively from prior knowledge. In this paper, to address this subjectivity, we make empirical comparisons of popular hierarchical and non-hierarchical clustering algorithms using the silhouette measure. We use silhouette information to evaluate clustering results such as the number of clusters and the cluster variance. We verify our comparison study with experimental results on data sets from the UCI machine learning repository. As a result, clustering algorithms can be selected efficiently and objectively.
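
As an illustration of the silhouette measure this abstract relies on, here is a minimal sketch in plain Python. It assumes Euclidean distance and the common convention that a point alone in its cluster gets a silhouette of 0; the function names `silhouette_values` and `mean_silhouette` are hypothetical and are not taken from the paper.

```python
import math


def silhouette_values(points, labels):
    """Return the silhouette value s(i) = (b - a) / max(a, b) for each point."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            values.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def mean_silhouette(points, labels):
    """Average silhouette over all points; higher suggests a better partition."""
    vals = silhouette_values(points, labels)
    return sum(vals) / len(vals)
```

Comparing `mean_silhouette` across the label vectors produced by different algorithms, or by the same algorithm with different numbers of clusters, is one way to carry out the kind of comparison described above.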

The Difference Order Clustering for Multi-dimensional Entities (다차원 개체를 위한 차이등급 clustering)

  • Rhee, Chul;Kang, Suk-Ho
    • Journal of the Korean Operations Research and Management Science Society / v.14 no.1 / pp.108-118 / 1989
  • The clustering problem for multi-dimensional entities is investigated. A heuristic method named Difference Order Clustering (DOC) is developed for grouping multi-dimensional entities. The DOC method has the advantage of identifying bottleneck entities. Comparisons among the proposed DOC method, the modified rank order clustering (MODROC) method, and lexicographical rank order clustering using a minimum spanning tree (lexico-MMSTROC) are illustrated with a part type selection problem.

Clustering Algorithm for Sequences of Categorical Values (범주형 값들이 순서를 가지고 있는 데이터들의 클러스터링 기법)

  • 오승준;김재련
    • Journal of Korean Society of Industrial and Systems Engineering / v.26 no.1 / pp.17-21 / 2003
  • We study a clustering algorithm for sequences of categorical values. Clustering is a data mining problem that has received significant attention from the database community. Traditional clustering algorithms deal with numerical or categorical data points. However, many important databases store sequences of categorical data. In this paper, we introduce a new similarity measure and develop a hierarchical clustering algorithm. Experiments show the performance of the proposed approach.

VS-FCM: Validity-guided Spatial Fuzzy c-Means Clustering for Image Segmentation

  • Kang, Bo-Yeong;Kim, Dae-Won
    • International Journal of Fuzzy Logic and Intelligent Systems / v.10 no.1 / pp.89-93 / 2010
  • In this paper, a new fuzzy clustering approach to the color clustering problem is proposed. To deal with the limitations of the traditional FCM algorithm, we propose a spatial homogeneity-based FCM algorithm. Moreover, a cluster validity index is employed to automatically determine the number of clusters for a given image. We refer to this method as the VS-FCM algorithm. The effectiveness of the proposed method is demonstrated through various clustering examples.
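
For context, the traditional FCM algorithm that this abstract builds on alternates between a membership update and a center update. The sketch below shows only that standard baseline, assuming Euclidean distance and a fuzzifier `m` greater than 1; it does not include the spatial homogeneity term or the validity index of VS-FCM, whose details the abstract does not give. All function names are hypothetical.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership of every point in every cluster, for fixed centers."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: split membership among them.
            hits = [d == 0.0 for d in dists]
            row = [1.0 / sum(hits) if h else 0.0 for h in hits]
        else:
            row = [
                1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                for d_i in dists
            ]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Standard FCM center update: mean of the points weighted by membership**m."""
    dim = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * x[d] for w, x in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def fcm(points, centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iterations):
        memberships = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, memberships, m)
    return centers, fcm_memberships(points, centers, m)
```

A fixed iteration count keeps the sketch short; a practical implementation would more likely stop once the memberships change by less than a tolerance.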

Clustering by Accelerated Simulated Annealing

  • Yoon, Bok-Sik;Ree, Sang-Bok
    • Korean Management Science Review / v.15 no.2 / pp.153-159 / 1998
  • Clustering, or classification, is a fundamental task that arises wherever items need to be grouped. Optimal clustering is a very complicated combinatorial optimization problem, and it is hard to develop a generally applicable optimal algorithm. In this paper, we propose a general-purpose algorithm for optimal clustering based on simulated annealing (SA). Among the iterative global optimization techniques that imitate natural phenomena and have been applied successfully to various combinatorial optimization problems, simulated annealing stands out for its convergence property and simplicity. We first present a version of accelerated simulated annealing (ASA) and then apply ASA to develop an efficient clustering algorithm. Application examples are also given.
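
To make the idea of clustering by simulated annealing concrete, the sketch below uses plain simulated annealing with geometric cooling. It is not the accelerated variant (ASA) proposed in the paper, which the abstract does not describe. The objective (within-cluster sum of squared distances), the single-point reassignment move, and all parameter names and defaults are assumptions made for illustration.

```python
import math
import random


def within_cluster_cost(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    cost = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        centroid = [sum(coord) / len(members) for coord in zip(*members)]
        cost += sum(
            sum((a - b) ** 2 for a, b in zip(p, centroid)) for p in members
        )
    return cost


def anneal_clustering(points, k, t_start=10.0, t_end=1e-3, alpha=0.95,
                      moves_per_temp=50, seed=0):
    """Search for a low-cost assignment of points to k clusters."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in points]
    cost = within_cluster_cost(points, labels, k)
    best_labels, best_cost = labels[:], cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            # Propose moving one randomly chosen point to a random cluster.
            i = rng.randrange(len(points))
            old = labels[i]
            labels[i] = rng.randrange(k)
            new_cost = within_cluster_cost(points, labels, k)
            delta = new_cost - cost
            # Metropolis rule: always accept improvements, sometimes accept worse moves.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best_labels, best_cost = labels[:], cost
            else:
                labels[i] = old  # reject the move and restore the old label
        t *= alpha  # geometric cooling schedule
    return best_labels, best_cost
```

Recomputing the full cost after every move keeps the code short but is wasteful; an incremental cost update would be the natural first optimization.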

An Energy Efficient Algorithm Based on Clustering Formulation and Scheduling for Proportional Fairness in Wireless Sensor Networks

  • Cheng, Yongbo;You, Xing;Fu, Pengcheng;Wang, Zemei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.2 / pp.559-573 / 2016
  • In this paper, we investigate the problem of achieving proportional fairness in hierarchical wireless sensor networks. Combining clustering formulation and scheduling, we maximize the total bandwidth utility for proportional fairness while keeping power consumption to a minimum. The problem is decomposed into two sub-problems and solved in two stages, the Clustering Formulation Stage and the Scheduling Stage. The resulting algorithm, called CSPF_PC, runs in a network formulation sequence. In the Clustering Formulation Stage, sensor nodes join cluster head nodes by adjusting their transmit power under a greedy strategy; in the Scheduling Stage, proportional fairness is achieved by scheduling time-slot resources. Simulation results verify that our algorithm outperforms the compared algorithms on the fairness index.

Nearest neighbor and validity-based clustering

  • Son, Seo H.;Seo, Suk T.;Kwon, Soon H.
    • International Journal of Fuzzy Logic and Intelligent Systems / v.4 no.3 / pp.337-340 / 2004
  • The clustering problem can be formulated as the problem of finding the number of clusters and a partition matrix for a given data set using iterative or non-iterative algorithms. The authors propose a nearest neighbor and validity-based clustering algorithm in which each data point is linked with its nearest neighbor to form initial clusters, and each initial cluster is then linked with its nearest neighbor cluster to form a new cluster. Linking between clusters continues until no more linking is possible. An optimal set of clusters is identified using a conventional cluster validity index. Experimental results on well-known data sets are provided to show the effectiveness of the proposed clustering algorithm.
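
The first stage described in this abstract, linking every data point with its nearest neighbor to form initial clusters, can be sketched as follows. This is one plausible reading that treats the links as undirected and takes connected components as the initial clusters, assuming Euclidean distance. The later cluster-to-cluster linking and the validity index are not covered, and the function name `nearest_neighbor_clusters` is hypothetical.

```python
import math


def nearest_neighbor_clusters(points):
    """Group point indices by linking each point to its nearest neighbor."""
    n = len(points)
    if n < 2:
        return [[i] for i in range(n)]

    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        parent[find(i)] = find(nearest)  # merge the two linked groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

For example, the one-dimensional points 0, 1, 10, 11, 20 yield two initial clusters: the first two points, and the last three, because 20 links to 11.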