Search | Korea Science

Empirical Comparisons of Clustering Algorithms using Silhouette Information

Jun, Sung-Hae;Lee, Seung-Joo
- International Journal of Fuzzy Logic and Intelligent Systems / v.10 no.1 / pp.31-36 / 2010
Many clustering algorithms have been used in diverse fields. When a given data set must be grouped into clusters, many algorithms based on similarity or distance measures can be considered. Most clustering work has been based on hierarchical and non-hierarchical algorithms. In practice, researchers have chosen among these algorithms case by case, and have had to select a suitable clustering method subjectively, relying on prior knowledge. In this paper, to address this subjectivity, we make empirical comparisons of popular hierarchical and non-hierarchical clustering algorithms using the silhouette measure. We use silhouette information to evaluate clustering results such as the number of clusters and cluster variance. We verify our comparison study with experiments on data sets from the UCI Machine Learning Repository. As a result, clustering algorithms can be chosen efficiently and objectively.
https://doi.org/10.5391/IJFIS.2010.10.1.031
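
The silhouette measure named in this abstract can be illustrated with a minimal sketch. It uses the standard definition s(i) = (b - a) / max(a, b), where a is the mean distance from point i to the other members of its own cluster and b is the smallest mean distance to any other cluster. The function names, the Euclidean distance, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def silhouette_values(points, labels):
    """Return s(i) = (b - a) / max(a, b) for every point.

    Assumes at least two distinct cluster labels.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            values.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def mean_silhouette(points, labels):
    """Average silhouette over all points; higher means better separation."""
    vals = silhouette_values(points, labels)
    return sum(vals) / len(vals)


# Hypothetical toy data: two tight pairs of points far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = [0, 0, 1, 1]  # groups nearby points together
bad = [0, 1, 0, 1]   # groups distant points together
```

Comparing `mean_silhouette(points, good)` with `mean_silhouette(points, bad)` shows the idea behind such comparisons: the partition that keeps nearby points together scores close to 1, while the other scores below 0.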

Comparison of clustering with yeast microarray gene expression data (효모 마이크로어레이 유전자발현 데이터에 대한 군집화 비교)

Lee, Kyung-A;Kim, Jae-Hee
- Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.741-753 / 2011
We perform cluster analyses of yeast cell cycle microarray expression data. We compare model-based clustering, K-means, PAM, SOM and the hierarchical Ward method on the yeast data. As validity measures for the clustering results, connectivity, the Dunn index and silhouette values are computed and compared.

A Study on Classification and Localization of Structural Damage through Wavelet Analysis

Koh, Bong-Hwan;Jung, Uk
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2007.11a / pp.754-759 / 2007
This study exploits the data-discriminating capability of silhouette statistics, combined with a wavelet-based vertical energy threshold technique, to extract damage-sensitive features and cluster signals of the same class. The threshold technique first yields a suitable subset of the extracted or modified features of the data; that is, good predictor sets should contain features that are strongly correlated with the characteristics of the data, regardless of the classification method used, while being as uncorrelated with each other as possible. Silhouette statistics have been used to assess the quality of clustering by measuring how well an object is assigned to its corresponding cluster. We use this concept as the discriminant power function in this paper. Simulation results for damage detection in a truss structure show that the proposed approach can successfully locate both open- and breathing-type damage, even in the presence of considerable process and measurement noise.

Gene Screening and Clustering of Yeast Microarray Gene Expression Data (효모 마이크로어레이 유전자 발현 데이터에 대한 유전자 선별 및 군집분석)

Lee, Kyung-A;Kim, Tae-Houn;Kim, Jae-Hee
- The Korean Journal of Applied Statistics / v.24 no.6 / pp.1077-1094 / 2011
We perform cluster analyses of yeast cell cycle microarray expression data. To reflect the characteristics of time-course data, we screen the genes using test statistics based on Fourier coefficients, applying an FDR procedure. We compare the results of model-based clustering, K-means, PAM, SOM, the hierarchical Ward method and a fuzzy method on the yeast data. As validity measures for the clustering results, connectivity, the Dunn index and silhouette values are computed and compared. A biological interpretation with GO analysis is also included.
https://doi.org/10.5351/KJAS.2011.24.6.1077

Comparison of the Cluster Validation Methods for High-dimensional (Gene Expression) Data (고차원 (유전자 발현) 자료에 대한 군집 타당성분석 기법의 성능 비교)

Jeong, Yun-Kyoung;Baek, Jang-Sun
- The Korean Journal of Applied Statistics / v.20 no.1 / pp.167-181 / 2007
Many clustering algorithms and cluster validation techniques for high-dimensional gene expression data have been suggested. These cluster validation techniques have, however, seldom been evaluated. In this paper we compare various cluster validity indices on low-dimensional simulated data and real gene expression data, and find that, among the internal measures, Dunn's index is the most effective and robust, the silhouette index is next, and the Davies-Bouldin index ranks last. Among the external measures, the Jaccard index is much more effective than the Goodman-Kruskal index and the adjusted Rand index.
https://doi.org/10.5351/KJAS.2007.20.1.167
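
As a rough illustration of the internal measure this abstract ranks first, the sketch below computes one common form of Dunn's index: the smallest distance between points in different clusters divided by the largest within-cluster distance (the cluster diameter). Several variants of the index exist, and the abstract does not say which one the paper uses; the function name, the Euclidean distance, and the toy data are assumptions made for illustration.

```python
from itertools import combinations
from math import dist  # Euclidean distance (Python 3.8+)


def dunn_index(points, labels):
    """Minimum between-cluster distance over maximum cluster diameter.

    Assumes at least two distinct cluster labels. Larger values indicate
    compact, well-separated clusters.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    # Largest distance between two points of the same cluster.
    max_diameter = max(
        (dist(p, q)
         for members in clusters.values()
         for p, q in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points of different clusters.
    min_separation = min(
        dist(p, q)
        for c1, c2 in combinations(clusters.values(), 2)
        for p in c1
        for q in c2
    )
    if max_diameter == 0:
        return float("inf")  # every cluster is a single point
    return min_separation / max_diameter


# Hypothetical toy data: two tight pairs of points far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = [0, 0, 1, 1]  # groups nearby points together
bad = [0, 1, 0, 1]   # groups distant points together
```

On this toy data the partition that groups nearby points gives a much larger index than the one that groups distant points, which is the behaviour a validity index is meant to capture.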

The Design Characteristics of Form of Jean Fashion in Fashion Collections (패션 컬렉션에 나타난 진패션의 형태적 디자인 특성)

Pu, Chen;Kim, Ae-Kyung;Lee, Kyoung-Hee
- The Journal of the Korea Contents Association / v.12 no.12 / pp.577-586 / 2012
This research focused on the design characteristics of jean jackets and jean pants in fashion collections. To offer a basic proposal for the development of jean jackets and pants, pictures from fashion web pages from 2007 to 2011 were used, and the data were analysed by frequency and percentage using PASW Statistics 18. The results of the research were as follows. Men's jackets were mainly medium in length with a tetragonal silhouette and simple detail. In contrast, women's jackets mainly had an X silhouette, were short in length, and had varied details. Men's jean pants were mainly represented by a straight, comfortable silhouette, while women's jean pants were characterized by a variety of silhouettes, fits, and lengths.
https://doi.org/10.5392/JKCA.2012.12.12.577

Comparison of time series clustering methods and application to power consumption pattern clustering

Kim, Jaehwi;Kim, Jaehee
- Communications for Statistical Applications and Methods / v.27 no.6 / pp.589-602 / 2020
The development of smart grids has enabled the easy collection of a large amount of power data. Power consumption shows some common patterns, which makes clustering useful when analyzing large-scale power data. In this paper, cluster analysis based on distance functions for time series and on clustering algorithms is used to discover patterns in power consumption data. We use 10 distance measures to find clusters that reflect the characteristics of time series data. A simulation study is done to compare the distance measures for clustering. Cluster validity measures such as error rate, similarity index, Dunn index and silhouette values are also calculated and compared. Real power consumption data are then clustered using the five distance measures that performed better than the others in the simulation.
https://doi.org/10.29220/CSAM.2020.27.6.589

A Study on Moving Fitness and Slit Length in Relation to Length & Silhouette of Tight Skirt (타이트 스커트 실루엣 및 길이에 따른 동작적합성과 트임길이에 관한 연구)

Kim, Hee Young;Choi, Hae Sun
- Journal of the Korean Society of Clothing and Textiles / v.17 no.4 / pp.539-549 / 1993
The purpose of this study was to determine the moving fitness and slit length of tight skirts in relation to length and silhouette. Five lengths (micro mini, mini, natural line, medi and maxi) and two silhouettes (slim and straight), for a total of ten tight skirts, were investigated. Ten college students were chosen for the experiment. Moving fitness was tested by measuring step length, step width and step angle when walking on a flat surface and when climbing a stairway and bus stairs. Slit length was tested by measuring the back slit length needed when climbing a stairway and bus stairs. Data were analyzed with the SAS package. The statistics used were the mean, standard deviation, two-way ANOVA, Pearson's correlation and multiple regression analysis. The main results were as follows. 1. There was a significant difference in moving fitness according to the length and silhouette of the tight skirt. The moving fitness of the slim type was lower than that of the straight type, and the longer the skirt, the lower the moving fitness. The difference was particularly significant when climbing bus stairs. 2. There was a significant difference in the skirt length above the slit according to the length and silhouette of the tight skirt. The skirt length above the slit was shorter for the slim type than for the straight type. It increased with skirt length from micro mini to natural line; for the medi skirt it was shorter or slightly longer than for the natural line skirt, and it changed little from medi to maxi.

Body Features and Body Satisfaction of Middle-aged Women for Clothing Design (의복설계를 위한 중년여성의 체형별 특징 및 신체만족도)

Kim, Kyung-Hee
- Journal of the Korea Fashion and Costume Design Association / v.10 no.2 / pp.57-68 / 2008
In this study, we prepared reference data for clothing design for middle-aged women by collecting their body features, classifying their body shapes, and analyzing their satisfaction with their own body shape. As for the method, we measured body shape with a body meter and auxiliary tools, analyzed the body features of middle-aged women, and rated satisfaction on a five-point scale from 'never satisfied' to 'very much satisfied.' We used the SPSS 12.0 statistics program, and the results are as follows. Body shapes of middle-aged women can be classified into four types. A middle-aged woman with an 'A silhouette' has normal height but fat lower limbs. The 'Y silhouette' is short with a fat upper body. The 'O silhouette' is short with fat lower limbs and upper body, and the 'H silhouette' is tall and thin. Body shape I was satisfied with her own body shape, and body shape II showed the most dissatisfaction compared with the other body shapes. Body shape III was satisfied on all items except face size and breast size, whereas body shape IV was dissatisfied with her face size, neck length, and the shape of her breasts, waist, and buttocks. The results of this study are expected to help the clothing business produce clothing that satisfies consumers, and to serve as basic data for clothing design that fits body shape by tracking how body shape changes.

A Comparison of Cluster Analyses and Clustering of Sensory Data on Hanwoo Bulls (군집분석 비교 및 한우 관능평가데이터 군집화)

Kim, Jae-Hee;Ko, Yoon-Sil
- The Korean Journal of Applied Statistics / v.22 no.4 / pp.745-758 / 2009
Cluster analysis is the automated search for groups of related observations in a data set. Many techniques have been proposed for grouping observations into clusters, and a variety of measures aimed at validating the results of a cluster analysis have been suggested. In this paper, we compare complete linkage, Ward's method, K-means and model-based clustering, and compute validity measures such as connectivity, the Dunn index and silhouette values on simulated data from multivariate distributions. We also select a clustering algorithm and determine the number of clusters of Korean consumers based on their palatability scores for Hanwoo bull beef cooked by the BBQ method.
https://doi.org/10.5351/KJAS.2009.22.4.745

