• Title/Summary/Keyword: optimal number of clusters

A Study of optimized clustering method based on SOM for CRM

  • Jong T. Rhee;Lee, Joon.
    • Proceedings of the Korea Intelligent Information System Society Conference / 2001.01a / pp.464-469 / 2001
  • Customer Relationship Management (CRM) is an advanced marketing support system that analyzes customers' transaction data and classifies or targets customer groups to increase market share and profit effectively. Many engines have been developed to implement this function, and those for classification and clustering are considered the core ones. In this study, an improved clustering method based on Self-Organizing Maps (SOM) is proposed. The proposed method finds the optimal number of clusters so that the effectiveness of clustering is increased. It considers all the data types existing in CRM data warehouses. In particular, an adaptive algorithm that applies the concepts of degeneration and fusion is used to find the optimal number of clusters. The feasibility and efficiency of the proposed method are demonstrated through simulation with simplified customer data.

Optimized Energy Cluster Routing for Energy Balanced Consumption in Low-cost Sensor Network

  • Han, Dae-Man;Koo, Yong-Wan;Lim, Jae-Hyun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.4 no.6 / pp.1133-1151 / 2010
  • Energy balanced consumption routing is based on the assumption that nodes consume energy in both transmitting and receiving. Lopsided energy consumption is an intrinsic problem in low-cost sensor networks characterized by multihop routing and in many traffic overhead pattern networks, and this irregular energy dissipation can significantly reduce network lifetime. In this paper, we study the problem of maximizing network lifetime by balancing energy consumption in uniformly deployed low-cost sensor networks. We formulate the energy consumption balancing problem as an optimal data transmission balancing problem by combining corona cluster based network division with an optimized transmitting state routing strategy for data transmission. We propose a localized cluster based routing scheme that guarantees balanced energy consumption among clusters within each corona. We develop a new energy cluster based routing protocol called "OECR". We design an offline centralized algorithm with time complexity O(log n), where n is the number of clusters, to solve the transmitting data distribution problem aimed at balancing energy consumption among nodes in different clusters. An approach for computing the optimal number of clusters to maximize the network lifetime is also presented. Based on the mathematical model, an optimized energy cluster routing (OECR) is designed and the solution for extending OEDR to low-cost sensor networks is also presented. Simulation results demonstrate that the proposed routing scheme significantly outperforms conventional energy routing schemes in terms of network lifetime.

Nearest neighbor and validity-based clustering

  • Son, Seo H.;Seo, Suk T.;Kwon, Soon H.
    • International Journal of Fuzzy Logic and Intelligent Systems / v.4 no.3 / pp.337-340 / 2004
  • The clustering problem can be formulated as the problem of finding the number of clusters and a partition matrix from a given data set using iterative or non-iterative algorithms. The authors propose a nearest neighbor and validity-based clustering algorithm in which each data point in the data set is linked with its nearest neighbor data point to form initial clusters, and then each initial cluster is linked with its nearest neighbor cluster to form a new cluster. Linking between clusters continues until no more linking is possible. An optimal set of clusters is identified using a conventional cluster validity index. Experimental results on well-known data sets are provided to show the effectiveness of the proposed clustering algorithm.
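
The first linking step described in this abstract can be illustrated with a minimal Python sketch. It only covers the formation of initial clusters by linking each point to its nearest neighbor; the later cluster-to-cluster linking and the validity index are omitted. The function name `nearest_neighbor_clusters`, the Euclidean distance, and the union-find bookkeeping are assumptions made for illustration and are not taken from the paper.

```python
import math


def nearest_neighbor_clusters(points):
    """Link each point to its nearest neighbor and return the linked groups.

    Illustrative sketch only: Euclidean distance and union-find are
    assumptions, not details stated in the abstract.
    """
    n = len(points)
    if n < 2:
        return [[i] for i in range(n)]

    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # Nearest neighbor of point i among all other points.
        j = min(
            (k for k in range(n) if k != i),
            key=lambda k: math.dist(points[i], points[k]),
        )
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(nearest_neighbor_clusters(sample))
```

Each returned group is a list of point indices. Points whose nearest-neighbor links form a connected chain end up in the same initial cluster.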

Feature Weighting in Projected Clustering for High Dimensional Data (고차원 데이타에 대한 투영 클러스터링에서 특성 가중치 부여)

  • Park, Jong-Soo
    • Journal of KIISE:Databases / v.32 no.3 / pp.228-242 / 2005
  • Projected clustering seeks to find clusters in different subspaces within a high dimensional dataset. We propose an algorithm to discover near optimal projected clusters without user specified parameters such as the number of output clusters and the average cardinality of subspaces of projected clusters. The objective function of the algorithm computes projected energy, quality, and the number of outliers in each process of clustering. In order to minimize the projected energy and to maximize the quality in clustering, we start by finding the best subspace of each cluster on the density of input points by comparing standard deviations of the full dimension. The weighting factor for each dimension of the subspace is used to get rid of probable error in measuring projected distances. Our extensive experiments show that our algorithm discovers projected clusters accurately and that it scales to large data sets.

A Cluster Validity Index Using Overlap and Separation Measures Between Fuzzy Clusters (클러스터간 중첩성과 분리성을 이용한 퍼지 분할의 평가 기법)

  • Kim, Dae-Won;Lee, Kwang-H.
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.4 / pp.455-460 / 2003
  • A new cluster validity index is proposed that determines the optimal partition and optimal number of clusters for fuzzy partitions obtained from the fuzzy c-means algorithm. The proposed validity index exploits an overlap measure and a separation measure between clusters. The overlap measure is obtained by computing an inter-cluster overlap. The separation measure is obtained by computing a distance between fuzzy clusters. A good fuzzy partition is expected to have a low degree of overlap and a large separation distance. Tests of the proposed index and nine previously formulated indexes on well-known data sets showed the superior effectiveness and reliability of the proposed index.

A Fuzzy C Elliptic Shells Clustering

  • 김대진
    • Journal of the Korean Institute of Intelligent Systems / v.8 no.4 / pp.18-22 / 1998
  • This paper presents a fuzzy c elliptic shells algorithm that detects clusters that can be expressed by hyperellipsoidal shells. The algorithm is computationally efficient since the prototypes of shell clusters are determined by a simple matrix inversion instead of by solving several nonlinear equations. The algorithm also works when the detected shells are partial or when the optimal number of clusters is unknown initially. A set of simulation results validates the proposed clustering method.

Extensions of X-means with Efficient Learning the Number of Clusters (X-means 확장을 통한 효율적인 집단 개수의 결정)

  • Heo, Gyeong-Yong;Woo, Young-Woon
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.4 / pp.772-780 / 2008
  • K-means is one of the simplest unsupervised learning algorithms that solve the clustering problem. However, K-means suffers from a basic shortcoming: the number of clusters k has to be known in advance. In this paper, we propose extensions of X-means, which can estimate the number of clusters using the Bayesian information criterion (BIC). We introduce two different versions of the algorithm: modified X-means (MX-means) and generalized X-means (GX-means), which employ one full covariance matrix per cluster and so can estimate the number of clusters efficiently without the severe over-fitting that X-means suffers due to its spherical cluster assumption. The algorithms start with one cluster and try to split a cluster iteratively to maximize the BIC score. The former uses the K-means algorithm to find a set of optimal clusters with the current k, which makes it simple and fast. However, it generates wrongly estimated centers when the clusters overlap. The latter uses the EM algorithm to estimate the parameters and generates more stable clusters even when the clusters overlap. Experiments with synthetic data show that the proposed methods can provide a robust estimate of the number of clusters and cluster parameters compared to other existing top-down algorithms.
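
To make the BIC-based comparison concrete, the following Python sketch scores a hard partition under a spherical Gaussian model with one shared variance, which is the assumption the abstract attributes to plain X-means. It is not the MX-means or GX-means method, which use full covariance matrices. The function name `bic_spherical`, the variance estimator, and the parameter count are illustrative assumptions.

```python
import math


def bic_spherical(points, labels):
    """BIC score (higher is better) of a hard partition.

    Assumes a spherical Gaussian per cluster with one shared variance.
    This is an illustrative sketch, not the method of the paper.
    """
    n = len(points)
    d = len(points[0])

    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if n <= k:
        raise ValueError("need more points than clusters")

    # Centroid of each cluster, computed coordinate by coordinate.
    centroids = {
        lab: tuple(sum(coord) / len(members) for coord in zip(*members))
        for lab, members in clusters.items()
    }

    # Shared per-dimension variance estimate (an assumed estimator).
    sq_dists = [math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels)]
    variance = sum(sq_dists) / (d * (n - k))
    if variance == 0:
        raise ValueError("zero variance: every point sits on its centroid")

    # Log-likelihood of the data under the fitted mixture with hard assignments.
    log_lik = 0.0
    for sq, lab in zip(sq_dists, labels):
        log_lik += math.log(len(clusters[lab]) / n)       # mixing weight
        log_lik -= 0.5 * d * math.log(2 * math.pi * variance)
        log_lik -= sq / (2 * variance)

    # Free parameters: k-1 mixing weights, k*d centroid coordinates, 1 variance.
    n_params = (k - 1) + k * d + 1
    return log_lik - 0.5 * n_params * math.log(n)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    print(bic_spherical(data, [0] * 8))
    print(bic_spherical(data, [0, 0, 0, 0, 1, 1, 1, 1]))
```

In a top-down scheme of this kind, a cluster would be split only if the split raises the score. On the toy data above, the two-cluster labeling scores higher than the single-cluster one.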

Analysis of Cluster-based Truck-Drone Delivery Routing Models (군집 기반 트럭-드론 배송경로 모형의 효과분석)

  • Chang, Yong Sik
    • Journal of Information Technology Applications and Management / v.26 no.1 / pp.53-64 / 2019
  • The purpose of this study is to find a fast delivery route in which, while a truck passes through clusters of several delivery locations, several drones depart from the truck for the delivery locations in each cluster and then return to it. The main issue is to reduce the total delivery time, which is composed of the delivery time of the relatively slow truck traveling via the clusters and the sum of the maximum delivery times of the relatively fast drones in each cluster. To solve this problem, we use a three-step heuristic approach. First, we group nearby delivery locations into the minimal number of clusters that satisfies a constraint on drone flight distance, in order to set delivery paths for drones in each cluster. Second, we set an optimal delivery route for the truck through the centers of the clusters using the TSP model. Finally, we find moved centers of the clusters in the two-dimensional region that reduce the total delivery time while maintaining the delivery paths for the truck and drones and satisfying the constraint on drone flight distance. To analyze the effect of the model as the number of delivery locations changes, we developed an R-based simulation prototype, compared the relative efficiency, and performed a paired t-test between the TSP model and the cluster-based models. The experiments demonstrated the effectiveness of the proposed approach.

Performance Evaluation of AMC in Clustered OFDM System

  • Cho, Ju-Phil
    • Journal of Korea Multimedia Society / v.8 no.12 / pp.1623-1630 / 2005
  • Adaptive modulation and coding (AMC), which has a number of variation levels in accordance with the fading channel variation, is a promising technique for communication systems. In this paper, we present an AMC method that uses clusters in an OFDM system for bandwidth efficiency and performance improvement. The AMC schemes applied to each cluster or to some clusters are determined by the minimum or the average SNR value among all the subcarriers within the corresponding cluster. It is important to find the optimal cluster configuration because AMC performance varies with the number and position of clusters. Computer simulation shows that the AMC method outperforms fixed modulation in terms of bandwidth efficiency and that its performance is determined by the position and number of clusters.
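
The per-cluster selection rule in this abstract (choose a scheme from the minimum or the average SNR of the subcarriers in a cluster) can be sketched in Python as follows. The SNR thresholds, the scheme names, the contiguous equal-size clusters, and the direct averaging of dB values are all hypothetical simplifications chosen for illustration. None of them come from the paper.

```python
from statistics import mean

# Hypothetical switching thresholds in dB, highest first (assumed values).
SCHEMES = [
    (18.0, "64QAM"),
    (12.0, "16QAM"),
    (6.0, "QPSK"),
    (float("-inf"), "BPSK"),
]


def select_schemes(snr_db, cluster_size, rule="min"):
    """Pick one modulation scheme per cluster of adjacent subcarriers.

    rule="min" uses the worst subcarrier SNR in the cluster and
    rule="avg" uses the mean of the dB values (a simplification).
    """
    if rule not in ("min", "avg"):
        raise ValueError("rule must be 'min' or 'avg'")

    chosen = []
    for start in range(0, len(snr_db), cluster_size):
        cluster = snr_db[start:start + cluster_size]
        representative = min(cluster) if rule == "min" else mean(cluster)
        # The first threshold that the representative SNR meets wins.
        for threshold, scheme in SCHEMES:
            if representative >= threshold:
                chosen.append(scheme)
                break
    return chosen


if __name__ == "__main__":
    snr = [20, 19, 7, 13, 5, 25, 30, 31]
    print(select_schemes(snr, cluster_size=4, rule="min"))
    print(select_schemes(snr, cluster_size=4, rule="avg"))
```

The minimum rule is the conservative choice, since a single faded subcarrier pulls the whole cluster down to a more robust scheme. The average rule favors throughput at the cost of errors on the weakest subcarriers.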

Identifying the Optimal Number of Homogeneous Regions for Regional Frequency Analysis Using Self-Organizing Map (자기조직화지도를 활용한 동일강수지역 최적군집수 분석)

  • Kim, Hyun Uk;Sohn, Chul;Han, Sang-Ok
    • Spatial Information Research / v.20 no.6 / pp.13-21 / 2012
  • In this study, homogeneous regions for regional frequency analysis were identified using rainfall data from 61 observation points in Korea. The data were gathered from 1980 to 2010. A self-organizing map and K-means clustering based on the Davies-Bouldin index were used to form clusters showing similar rainfall patterns and to decide the optimum number of homogeneous regions. The results of this analysis showed that the 61 observation points can be optimally grouped into 6 geographical clusters. Finally, the 61 observation points grouped into 6 clusters were mapped regionally using the Thiessen polygon method.
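
For reference, the Davies-Bouldin index named in this abstract can be computed as in the Python sketch below, using the standard definition: for each cluster, take the worst ratio of summed within-cluster scatter to centroid separation, then average over clusters. Lower values indicate more compact, better separated clusters. The function name `davies_bouldin`, the Euclidean distance, and the toy data are assumptions for illustration. The sketch does not reproduce the study's rainfall analysis.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a hard partition (lower is better).

    Scatter is the mean Euclidean distance of a cluster's points to its
    centroid. Assumes at least two clusters with distinct centroids.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")

    centroids = {
        lab: tuple(sum(coord) / len(members) for coord in zip(*members))
        for lab, members in clusters.items()
    }
    scatter = {
        lab: sum(math.dist(p, centroids[lab]) for p in members) / len(members)
        for lab, members in clusters.items()
    }

    total = 0.0
    for i in clusters:
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    print(davies_bouldin(data, [0, 0, 0, 0, 1, 1, 1, 1]))
    print(davies_bouldin(data, [0, 1, 0, 1, 0, 1, 0, 1]))
```

To choose the number of clusters, one would typically run the clustering for each candidate count and keep the one with the lowest index. On the toy data, the natural two-group labeling scores far lower than the interleaved labeling.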