Search | Korea Science

Empirical Comparisons of Clustering Algorithms using Silhouette Information

Jun, Sung-Hae;Lee, Seung-Joo
International Journal of Fuzzy Logic and Intelligent Systems, v.10 no.1, pp.31-36, 2010
Many clustering algorithms have been used in diverse fields. When a given data set must be grouped into clusters, many clustering algorithms based on similarity or distance measures can be considered. Most clustering work has been based on hierarchical and non-hierarchical clustering algorithms. Generally, researchers choose among these algorithms case by case, and they have to determine a proper clustering method subjectively from their prior knowledge. In this paper, to address this subjectivity, we make empirical comparisons of popular hierarchical and non-hierarchical clustering algorithms using the silhouette measure. We use silhouette information to evaluate clustering results such as the number of clusters and cluster variance. We verify our comparison study with experimental results on data sets from the UCI Machine Learning Repository. As a result, clustering algorithms can be selected efficiently and objectively.
https://doi.org/10.5391/IJFIS.2010.10.1.031
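
The silhouette measure named in the abstract above can be illustrated with a minimal sketch. It uses the standard definition s(i) = (b - a) / max(a, b) with Euclidean distance; the function names and the four sample points are hypothetical and are not taken from the paper.

```python
import math

def silhouette_scores(points, labels):
    """Return the silhouette value s(i) for every point.

    a = mean distance from point i to the other members of its own cluster
    b = smallest mean distance from point i to the members of another cluster
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own or len(clusters) < 2:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores

def mean_silhouette(points, labels):
    """Average silhouette over all points; higher means a better partition."""
    scores = silhouette_scores(points, labels)
    return sum(scores) / len(scores)

# Hypothetical data: two tight pairs of points that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = mean_silhouette(points, [0, 0, 1, 1])  # groups the nearby points
bad = mean_silhouette(points, [0, 1, 0, 1])   # groups the distant points
```

Comparing the mean silhouette across candidate partitions, or across different numbers of clusters, is one way to make the choice of clustering less subjective.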

The Document Clustering using Multi-Objective Genetic Algorithms (다목적 유전자 알고리즘을 이용한문서 클러스터링)

Lee, Jung-Song;Park, Soon-Cheol
Journal of Korea Society of Industrial Information Systems, v.17 no.2, pp.57-64, 2012
In this paper, a multi-objective genetic algorithm is proposed for document clustering, which is important in the text mining field. The most important function of a document clustering algorithm is to group similar documents in a corpus. So far, k-means clustering and genetic algorithms have been widely studied in this field. However, k-means clustering depends too much on the initial centroids, and the genetic algorithm has the disadvantage of easily falling into a local optimum depending on the fitness function. In this paper, the multi-objective genetic algorithm is applied to document clustering to compensate for these disadvantages, and its accuracy is analyzed and compared with that of the existing algorithms. In our experimental results, the multi-objective genetic algorithm introduced in this paper improves accuracy over k-means clustering (by about 20%) and the general genetic algorithm (by about 17%) for document clustering.
https://doi.org/10.9723/jksiis.2012.17.2.057
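
The sensitivity of k-means to its initial centroids, mentioned in the abstract above, can be seen in a small sketch. This is plain Lloyd-style k-means on four hypothetical 2-D points, not the paper's document-clustering setup; the function name and data are assumptions for illustration only.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd iteration from the given initial centroids.

    Returns the final centroids and the sum of squared errors (SSE).
    """
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        new_centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[k]
            for k, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    sse = sum(min(math.dist(p, c) for c in centroids) ** 2 for p in points)
    return centroids, sse

# Hypothetical data: the corners of a 4-by-1 rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Initial centroids on the long axis converge to a poor local optimum.
_, sse_bad = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])
# Initial centroids near the two short sides find the better partition.
_, sse_good = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])
```

Both runs converge immediately, but to different partitions with very different SSE, which is the kind of dependence on initialization that motivates global search methods such as genetic algorithms.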

A Study of an Extended Fuzzy Cluster Analysis on Special Shape Data (특별한 형태의 자료에 대한 확장된 Fuzzy 집락분석방법에 관한 연구)

임대혁
Journal of Korean Society of Industrial and Systems Engineering, v.25 no.6, pp.36-41, 2002
We consider fuzzy clustering, which is devised for partitioning a set of objects into a certain number of groups by assigning membership probabilities to each object. Earlier research in this field shows that the fuzzy clustering concept is involved to such an extent that, for certain data sets, the main purpose of clustering cannot be attained as desired. Thus we propose a new objective function, named the Fuzzy-Entropy Function, to satisfy the main motivation of clustering, which is classifying the data clearly. Because the objective function is changed, we also suggest the Mean Field Annealing Algorithm as the optimization algorithm rather than the ISODATA algorithm traditionally used in this field. We show that the Mean Field Annealing Algorithm works well not only for the new objective function but also for the classical fuzzy objective function, indicating that the local minimum problem arising from ISODATA can be mitigated.
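
As a rough illustration of what "assigning membership probabilities" means, the sketch below uses the classical fuzzy c-means membership formula and the Shannon entropy of a membership vector. It is not the paper's Fuzzy-Entropy Function or its Mean Field Annealing procedure, whose details are not given in the abstract; the function names, the fuzzifier `m`, and the sample points are assumptions.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Classical fuzzy c-means memberships of one point (fuzzifier m > 1).

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)); the memberships sum to 1.
    """
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The point coincides with a center: give it full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]

def entropy(memberships):
    """Shannon entropy of a membership vector; 0 means a crisp assignment."""
    return -sum(u * math.log(u) for u in memberships if u > 0.0)

# Hypothetical example with two cluster centers on a line.
centers = [(0.0, 0.0), (4.0, 0.0)]
u_near = fcm_memberships((1.0, 0.0), centers)  # close to the first center
u_mid = fcm_memberships((2.0, 0.0), centers)   # halfway between the centers
```

Here `u_near` is close to a crisp assignment and has low entropy, while `u_mid` splits evenly between the clusters and has the maximum entropy for two clusters. An entropy term in an objective function is one way to reward clearer classification.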

Magnetoencephalography Interictal Spike Clustering in Relation with Surgical Outcome of Cortical Dysplasia

Jeong, Woorim;Chung, Chun Kee;Kim, June Sic
Journal of Korean Neurosurgical Society, v.52 no.5, pp.466-471, 2012
Objective: The aim of this study was to devise an objective clustering method for magnetoencephalography (MEG) interictal spike sources, and to identify the prognostic value of the new clustering method in adult epilepsy patients with cortical dysplasia (CD). Methods: We retrospectively analyzed 25 adult patients with histologically proven CD, who underwent MEG examination and surgical resection for intractable epilepsy. The mean postoperative follow-up period was 3.1 years. A hierarchical clustering method was adopted for MEG interictal spike source clustering. Clustered sources were then tested for their prognostic value toward surgical outcome. Results: Postoperative seizure outcome was Engel class I in 6 (24%), class II in 3 (12%), class III in 12 (48%), and class IV in 4 (16%) patients. With respect to MEG spike clustering, 12 of 25 (48%) patients showed 1 cluster, 2 (8%) showed 2 or more clusters within the same lobe, 10 (40%) showed 2 or more clusters in different lobes, and 1 (4%) patient had only scattered spikes with no clustering. Patients who showed focal clustering achieved better surgical outcome than distributed cases (p=0.017). Conclusion: This is the first study that introduces an objective method to classify the distribution of MEG interictal spike sources. By using a hierarchical clustering method, we found that the presence of focal clustered spikes predicts a better postoperative outcome in epilepsy patients with CD.
https://doi.org/10.3340/jkns.2012.52.5.466
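
The idea of grouping spike sources hierarchically and separating clusters from scattered points can be sketched as follows. The abstract does not state the linkage, distance cutoff, or minimum cluster size used in the study, so single linkage, the `cutoff` of 10, the minimum size of 3, and the coordinates below are all assumptions chosen only for illustration.

```python
import math

def single_linkage(points, cutoff):
    """Agglomerative single-linkage clustering.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `cutoff`. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > cutoff:
            break  # remaining clusters are too far apart to merge
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters

# Hypothetical 3-D source locations: two tight groups and one isolated source.
points = [
    (0, 0, 0), (2, 1, 0), (1, 3, 1),
    (50, 0, 0), (52, 2, 1), (51, 1, 3),
    (0, 80, 0),
]
clusters = single_linkage(points, cutoff=10.0)
focal = [c for c in clusters if len(c) >= 3]     # groups counted as clusters
scattered = [c for c in clusters if len(c) < 3]  # isolated sources
```

With these made-up points the sketch yields two clusters and one scattered source, mirroring the kind of categories (single cluster, multiple clusters, scattered only) reported in the abstract.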

A Mixed Co-clustering Algorithm Based on Information Bottleneck

Liu, Yongli;Duan, Tianyi;Wan, Xing;Chao, Hao
Journal of Information Processing Systems, v.13 no.6, pp.1467-1486, 2017
Fuzzy co-clustering is sensitive to noise data. To overcome this noise sensitivity defect, possibilistic clustering relaxes the constraints in FCM-type fuzzy (co-)clustering. In this paper, we introduce a new possibilistic fuzzy co-clustering algorithm based on information bottleneck (ibPFCC). This algorithm combines fuzzy co-clustering and possibilistic clustering, and formulates an objective function which includes a distance function that employs information bottleneck theory to measure the distance between feature data point and feature cluster centroid. Many experiments were conducted on three datasets and one artificial dataset. Experimental results show that ibPFCC is better than such prominent fuzzy (co-)clustering algorithms as FCM, FCCM, RFCC and FCCI, in terms of accuracy and robustness.
https://doi.org/10.3745/JIPS.01.0019

A Study on an Extended Fuzzy Cluster Analysis (확장된 Fuzzy 집락분석방법에 관한 연구)

Im Dae-Heug
Management & Information Systems Review, v.9, pp.25-39, 2002
We consider fuzzy clustering, which is devised for partitioning a set of objects into a certain number of groups by assigning membership probabilities to each object. Earlier research in this field shows that the fuzzy clustering concept is involved to such an extent that, for certain data sets, the main purpose of clustering cannot be attained as desired. Thus we propose a new objective function, named the Fuzzy-Entropy Function, to satisfy the main motivation of clustering, which is classifying the data clearly. Because the objective function is changed, we also suggest the Mean Field Annealing Algorithm as the optimization algorithm rather than the ISODATA algorithm traditionally used in this field. We show that the Mean Field Annealing Algorithm works well not only for the new objective function but also for the classical fuzzy objective function, indicating that the local minimum problem arising from ISODATA can be mitigated.

A Study of Simulation Method and New Fuzzy Cluster Analysis (새로운 Fuzzy 집락분석방법과 Simulation기법에 관한 연구)

Im Dae-Heug
Management & Information Systems Review, v.14, pp.51-65, 2004
We consider fuzzy clustering, which is devised for partitioning a set of objects into a certain number of groups by assigning membership probabilities to each object. Earlier research in this field shows that the fuzzy clustering concept is involved to such an extent that, for certain data sets, the main purpose of clustering cannot be attained as desired. Thus we propose a new objective function, named the Fuzzy-Entropy Function, to satisfy the main motivation of clustering, which is classifying the data clearly. Because the objective function is changed, we also suggest the Mean Field Annealing Algorithm as the optimization algorithm rather than the ISODATA algorithm traditionally used in this field. We show that the Mean Field Annealing Algorithm works well not only for the new objective function but also for the classical fuzzy objective function, indicating that the local minimum problem arising from ISODATA can be mitigated.

Research on Low-energy Adaptive Clustering Hierarchy Protocol based on Multi-objective Coupling Algorithm

Li, Wuzhao;Wang, Yechuang;Sun, Youqiang;Mao, Jie
KSII Transactions on Internet and Information Systems (TIIS), v.14 no.4, pp.1437-1459, 2020
A wireless sensor network (WSN) is a distributed sensor network whose terminals are sensors that can sense and monitor the environment. Sensors are typically battery-powered and deployed where the batteries are difficult to replace. Therefore, using node energy efficiently and extending the network's life cycle are problems that must be faced. The low-energy adaptive clustering hierarchy (LEACH) protocol is an adaptive clustering topology algorithm that lets the nodes in the network consume energy in a relatively balanced way and prolongs the network lifetime. In this paper, a novel multi-objective LEACH protocol is proposed. To solve it, we design a multi-objective coupling algorithm based on the bat algorithm (BA), the glowworm swarm optimization algorithm (GSO), and the bacterial foraging optimization algorithm (BFO). The multi-objective coupling algorithm (MBGF) inherits the advantages of BA, GSO, and BFO; it is tested on the ZDT and SCH benchmarks, and the results show that MBGF is superior. The multi-objective coupling algorithm is then applied to the multi-objective LEACH protocol, and experimental results show that the multi-objective LEACH protocol can greatly reduce node energy consumption and prolong the network life cycle.
https://doi.org/10.3837/tiis.2020.04.003
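
How LEACH balances energy use can be sketched with the cluster-head election rule of the classic LEACH protocol, where a node that has not yet served in the current epoch becomes a cluster head if a random draw falls below T = p / (1 - p * (r mod 1/p)). This is the baseline rule only, not the paper's multi-objective variant or its MBGF algorithm; the function names, node count, and `p = 0.25` are assumptions.

```python
import random

def leach_threshold(p, round_number):
    """Classic LEACH threshold for nodes that have not yet been cluster head.

    p is the desired fraction of cluster heads per round.
    """
    period = round(1 / p)
    return p / (1 - p * (round_number % period))

def run_epoch(node_ids, p, rng):
    """Simulate one epoch of 1/p rounds; return the heads chosen per round."""
    period = round(1 / p)
    eligible = set(node_ids)  # nodes that have not served in this epoch
    history = []
    for r in range(period):
        t = leach_threshold(p, r)
        heads = {n for n in sorted(eligible) if rng.random() < t}
        eligible -= heads  # a node serves at most once per epoch
        history.append(heads)
    return history

# Hypothetical network of 20 nodes with 25% cluster heads per round.
history = run_epoch(range(20), p=0.25, rng=random.Random(0))
```

Because the threshold rises to 1 in the last round of an epoch, every node serves as cluster head exactly once per epoch, which spreads the costly cluster-head role evenly across the nodes.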

On the clustering of huge categorical data

Kim, Dae-Hak
Journal of the Korean Data and Information Science Society, v.21 no.6, pp.1353-1359, 2010
The basic objective of cluster analysis is to discover natural groupings of items. In general, clustering is conducted on a similarity (or dissimilarity) matrix or on the original input data. Various measures of similarity between objects have been developed. In this paper, we consider the clustering of a huge real categorical data set that describes the time-location-activity patterns of Korean people. Some useful similarity measures for the categorical variables are developed and adopted for this data set. Hierarchical and nonhierarchical clustering methods are applied to the data set, which is huge and consists of many categorical variables.
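
The abstract does not say which similarity measures were developed, so the sketch below shows only a common baseline for categorical data, the simple matching dissimilarity (the fraction of attributes on which two records disagree). The function names and the time-location-activity records are hypothetical.

```python
def matching_dissimilarity(a, b):
    """Fraction of categorical attributes on which two records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def dissimilarity_matrix(records):
    """Pairwise dissimilarities, usable as input to hierarchical clustering."""
    return [[matching_dissimilarity(r, s) for s in records] for r in records]

# Hypothetical (time, location, activity) records.
records = [
    ("morning", "home", "sleep"),
    ("morning", "home", "meal"),
    ("evening", "office", "work"),
]
matrix = dissimilarity_matrix(records)
```

The first two records differ in one of three attributes, while the third differs from both in all three, so a clustering built on this matrix would group the first two together.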

An Overview of Unsupervised and Semi-Supervised Fuzzy Kernel Clustering

Frigui, Hichem;Bchir, Ouiem;Baili, Naouel
International Journal of Fuzzy Logic and Intelligent Systems, v.13 no.4, pp.254-268, 2013
For real-world clustering tasks, the input data is typically not easily separable, either because the data structure is highly complex or because clusters vary in size, density, and shape. Kernel-based clustering has proven to be an effective approach to partition such data. In this paper, we provide an overview of several fuzzy kernel clustering algorithms. We focus on methods that optimize a fuzzy C-means-type objective function. We highlight the advantages and disadvantages of each method. In addition to the completely unsupervised algorithms, we also provide an overview of some semi-supervised fuzzy kernel clustering algorithms. These algorithms use partial supervision information to guide the optimization process and avoid local minima. We also provide an overview of the different approaches that have been used to extend kernel clustering to handle very large data sets.
https://doi.org/10.5391/IJFIS.2013.13.4.254
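
Kernel-based variants of fuzzy C-means typically replace the Euclidean distance with a distance measured in the kernel's feature space. As a minimal sketch, assuming a Gaussian kernel with a hypothetical bandwidth `sigma`, the squared feature-space distance K(x, x) - 2K(x, v) + K(v, v) simplifies to 2(1 - K(x, v)) because K(x, x) = 1. The function names are illustrative and not taken from the overview.

```python
import math

def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel; sigma is an assumed bandwidth parameter."""
    return math.exp(-math.dist(x, v) ** 2 / (2.0 * sigma ** 2))

def kernel_distance_sq(x, v, sigma=1.0):
    """Squared distance between x and v in the kernel-induced feature space.

    In general this is K(x, x) - 2 K(x, v) + K(v, v); for the Gaussian
    kernel K(x, x) = K(v, v) = 1, so it reduces to 2 * (1 - K(x, v)).
    """
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))

near = kernel_distance_sq((0.0, 0.0), (0.1, 0.0))   # almost identical points
far = kernel_distance_sq((0.0, 0.0), (10.0, 0.0))   # very distant points
```

The kernel distance is bounded above by 2, so very distant points and outliers cannot dominate the objective the way they do under squared Euclidean distance.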

Search Result 224, Processing Time 0.023 seconds
