• Title/Abstract/Keywords: silhouette statistics

22 search results (processing time: 0.022 s)

Empirical Comparisons of Clustering Algorithms using Silhouette Information

  • Jun, Sung-Hae;Lee, Seung-Joo
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 10, No. 1 / pp. 31-36 / 2010
  • Many clustering algorithms have been used in diverse fields. When a given data set must be grouped into clusters, many clustering algorithms based on similarity or distance measures are considered. Most clustering work has been based on hierarchical and non-hierarchical clustering algorithms. In general, researchers have chosen among these algorithms case by case and have had to determine a proper clustering method subjectively, based on their prior knowledge. In this paper, to address this subjectivity, we make empirical comparisons of popular hierarchical and non-hierarchical clustering algorithms using the silhouette measure. We use silhouette information to evaluate clustering results such as the number of clusters and cluster variance. We verify our comparison study with experimental results on data sets from the UCI Machine Learning Repository. As a result, we are able to use efficient and objective clustering algorithms.

Comparison of clustering with yeast microarray gene expression data

  • 이경아;김재희
    • Journal of the Korean Data and Information Science Society / Vol. 22, No. 4 / pp. 741-753 / 2011
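  • Cluster analysis was performed on yeast data, a microarray gene expression data set. Clustering was carried out using model-based clustering, K-means, partitioning around medoids (PAM), self-organizing maps (SOM), and hierarchical Ward clustering. The validity of each clustering method is measured with connectivity, the Dunn index, and the silhouette measure, and the clustering results are compared.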

A Study on Classification and Localization of Structural Damage through Wavelet Analysis

  • 고봉환;정욱
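    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / Proceedings of the Korean Society for Noise and Vibration Engineering 2007 Autumn Conference / pp. 754-759 / 2007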
  • This study exploits the data-discriminating capability of silhouette statistics, combined with a wavelet-based vertical energy threshold technique, to extract damage-sensitive features and cluster signals of the same class. The threshold technique first yields a suitable subset of the extracted or modified features of the data: good predictor sets should contain features that are strongly correlated with the characteristics of the data, regardless of the classification method used, while being as uncorrelated with each other as possible. Silhouette statistics have been used to assess the quality of clustering by measuring how well an object is assigned to its corresponding cluster. We use this concept for the discriminant power function in this paper. Simulation results for damage detection in a truss structure show that the proposed approach can successfully locate both open- and breathing-type damage, even in the presence of a considerable amount of process and measurement noise.

Gene Screening and Clustering of Yeast Microarray Gene Expression Data

  • 이경아;김태훈;김재희
    • The Korean Journal of Applied Statistics / Vol. 24, No. 6 / pp. 1077-1094 / 2011
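  • For the yeast cdc15 microarray gene expression data, differentially expressed genes were selected using a test statistic based on Fourier coefficients, which reflects the time-series nature of the data, together with the FDR multiple comparison procedure. The selected genes were then clustered using model-based clustering, K-means, PAM, SOM, hierarchical Ward clustering, and fuzzy clustering. The characteristics of each clustering method are examined, and the clustering results are assessed with the internal validity measures of connectivity, the Dunn index, and silhouette values. The biological meaning of the clusters is also examined through GO analysis.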

Comparison of the Cluster Validation Methods for High-dimensional (Gene Expression) Data

  • 정윤경;백장선
    • The Korean Journal of Applied Statistics / Vol. 20, No. 1 / pp. 167-181 / 2007
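  • Gene expression data are typical high-dimensional data. Many clustering algorithms for analyzing them, and cluster validation techniques for verifying the clustering results, have been proposed, but comparisons and evaluations of the performance of these cluster validation techniques are very rare. In this paper, we compared the performance of cluster validation techniques on low-dimensional simulated data and on real gene expression data. Among internal measures, the Dunn index performed best, followed by the Silhouette index; among external measures, the Jaccard index performed best.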

The Design Characteristics of Form of Jean Fashion in Fashion Collections

  • 진박;김애경;이경희
    • The Journal of the Korea Contents Association / Vol. 12, No. 12 / pp. 577-586 / 2012
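  • This study concerns jean jackets and jean pants. By analyzing the design characteristics of the jean jackets and jean pants presented in fashion collections and exploring their potential applications, it aims to provide basic data for jean fashion design research and for merchandise planning in the fashion industry. Photographs from the 2007 S/S season through the 2011 F/W season were collected and analyzed in terms of frequencies and percentages using the statistical program SPAW. The results and conclusions are summarized as follows. Men's jean jackets use simple details, a medium length, and a rectangular silhouette, whereas women's jean jackets use an X-shaped silhouette that emphasizes femininity, a short length, and varied details. Men's jean pants use straight and comfort silhouettes, whereas women can use jeans to create diverse images through varied silhouettes, fits, and lengths.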

Comparison of time series clustering methods and application to power consumption pattern clustering

  • Kim, Jaehwi;Kim, Jaehee
    • Communications for Statistical Applications and Methods / Vol. 27, No. 6 / pp. 589-602 / 2020
  • The development of smart grids has enabled the easy collection of a large amount of power data. There are some common patterns that make it useful to cluster power consumption patterns when analyzing power big data. In this paper, clustering analysis is based on distance functions for time series and clustering algorithms to discover patterns in power consumption data. In clustering, we use 10 distance measures to find clusters that reflect the characteristics of time series data. A simulation study is done to compare the distance measures for clustering. Cluster validity measures such as error rate, similarity index, Dunn index, and silhouette values are also calculated and compared. Real power consumption data are then clustered using the five distance measures that performed better than the others in the simulation.

A Study on Moving Fitness and Slit Length in Relation to Length & Silhouette of Tight Skirt

  • 김희영;최혜선
    • Journal of the Korean Society of Clothing and Textiles / Vol. 17, No. 4 / pp. 539-549 / 1993
  • The purpose of this study was to determine the moving fitness and slit length of a tight skirt in relation to its length and silhouette. Five lengths (micro mini, mini, natural line, medi, and maxi) and two silhouettes (slim and straight), for a total of ten tight skirts, were investigated. Ten college students were chosen for this experiment. Moving fitness was tested by measuring step length, step width, and step angle when walking on a flat surface and when going up a stairway and a bus stair. Slit length was tested by measuring the back slit length needed when going up a stairway and a bus stair. Data were analyzed using the SAS package. The statistics used were the mean, standard deviation, two-way ANOVA, Pearson's correlation, and multiple regression analysis. The main results were as follows. 1. There was a significant difference in moving fitness according to the length and silhouette of the tight skirt. The moving fitness of the slim type was lower than that of the straight type, and the longer the skirt, the lower the moving fitness. The significance appeared particularly when going up the bus stair. 2. There was a significant difference in the skirt length above the slit according to the length and silhouette of the tight skirt. The skirt length above the slit of the slim type was shorter than that of the straight type. From micro mini to natural line, the longer the skirt, the longer the length above the slit; for the medi skirt it was shorter or slightly longer than for the natural line skirt, and there was little change from the medi skirt to the maxi skirt.

Body Features and Body Satisfaction of Middle-aged Women for Clothing Design

  • 김경희
    • Journal of the Korea Fashion & Costume Design Association / Vol. 10, No. 2 / pp. 57-68 / 2008
  • In this study, we prepared reference data needed for clothing design for middle-aged women by classifying their body shapes from collected body features and analyzing their satisfaction with each body shape. As for the study method, we analyzed the body features of middle-aged women by measuring their body shape with a body meter and auxiliary tools, and we measured satisfaction on a five-point scale from 'never satisfied' to 'very much satisfied.' We used the SPSS 12.0 statistics program, and the results are as follows. The body shapes of middle-aged women can be classified into four types. A middle-aged woman with an 'A silhouette' has a normal height but fat lower limbs. A 'Y silhouette' is short with a fat upper body. The 'O silhouette' is short with fat lower limbs and upper body, and the 'H silhouette' is tall and thin. Body shape I displayed satisfaction with her own body shape, and body shape II showed the most dissatisfaction compared to the other body shapes. Body shape III showed satisfaction on all items except face size and breast size, whereas body shape IV was dissatisfied with her face size, neck length, and the shape of her breasts, waist, and buttocks. The results of this study are expected to contribute to clothing production that satisfies consumers in the clothing business, and to serve as basic data for clothing design that fits their body shape by capturing the changing patterns of their body shape.

A Comparison of Cluster Analyses and Clustering of Sensory Data on Hanwoo Bulls

  • 김재희;고윤실
    • The Korean Journal of Applied Statistics / Vol. 22, No. 4 / pp. 745-758 / 2009
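  • Cluster analysis, widely used as a multivariate statistical technique that induces natural groupings, serves as a data-driven exploratory method, and various methods have been proposed according to different clustering principles. Various measures have also been developed to assess the validity of clustering results. In this study, we perform cluster analysis in simulations using complete linkage and Ward's method as hierarchical methods, K-means as a non-hierarchical method, and model-based clustering, which uses probability distribution information. We compute connectivity, the Dunn index, and the silhouette as cluster validity measures and compare the validity of each clustering method. We also apply cluster analysis to Hanwoo sensory evaluation data to find the optimal clustering.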