• Title/Summary/Keyword: K-means clustering (K-평균 군집법)

Gene Screening and Clustering of Yeast Microarray Gene Expression Data (효모 마이크로어레이 유전자 발현 데이터에 대한 유전자 선별 및 군집분석)

  • Lee, Kyung-A;Kim, Tae-Houn;Kim, Jae-Hee
    • The Korean Journal of Applied Statistics
    • /
    • v.24 no.6
    • /
    • pp.1077-1094
    • /
    • 2011
  • We perform clustering analyses of yeast cell cycle microarray expression data. To reflect the characteristics of time-course data, we screen the genes using test statistics based on Fourier coefficients and apply an FDR procedure. We compare the results of model-based clustering, K-means, PAM, SOM, the hierarchical Ward method, and the fuzzy method on the yeast data. As validity measures for the clustering results, connectivity, the Dunn index, and silhouette values are computed and compared. A biological interpretation based on GO analysis is also included.
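
The abstract does not say which FDR procedure was applied. As a minimal sketch, assuming the Benjamini-Hochberg step-up procedure and assuming that per-gene p-values have already been obtained from the Fourier-based test statistics, the screening step might look like the following. The function name and the example p-values are hypothetical and are not taken from the paper.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the sorted indices of hypotheses rejected at FDR level q.

    Assumption: Benjamini-Hochberg step-up; the paper may use another FDR rule.
    """
    m = len(p_values)
    # Order hypothesis indices from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    return sorted(order[:k_max])


# Hypothetical p-values for ten genes (illustration only).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected_genes = benjamini_hochberg(p, q=0.05)
```

Only the genes whose indices are returned would be passed on to the clustering step.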

K-means clustering using a center of gravity for grid-based sample (그리드 기반 표본의 무게중심을 이용한 케이-평균군집화)

  • Lee, Sun-Myung;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.21 no.1
    • /
    • pp.121-128
    • /
    • 2010
  • K-means clustering is an iterative algorithm in which items are moved among clusters until the desired partition is reached. It has been widely used in applications such as market research, pattern analysis and recognition, and image processing, and it can identify dense and sparse regions among data attributes or object attributes. However, the k-means algorithm can take many hours to produce the desired k clusters, because it is a relatively primitive, exploratory method. In this paper we propose a new k-means clustering method that uses the center of gravity of a grid-based sample. It is faster than traditional clustering methods while maintaining accuracy.
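
The abstract gives no algorithmic detail, so the following is only a sketch under stated assumptions: the data are first reduced to one center of gravity per occupied grid cell, and a plain Lloyd-style k-means is then run on those centers instead of on the full data. The function names, the `cell` width parameter, and the synthetic data are all hypothetical; the authors' actual sampling scheme may differ.

```python
import random
from collections import defaultdict


def mean(points):
    """Coordinate-wise mean (center of gravity) of a list of tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd iteration; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # no item moved: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = mean(members)
    return centers, labels


def grid_centroids(points, cell):
    """Assumed reduction step: one center of gravity per occupied grid cell."""
    cells = defaultdict(list)
    for p in points:
        cells[tuple(int(c // cell) for c in p)].append(p)
    return [mean(members) for members in cells.values()]


# Synthetic two-blob data (illustration only).
rng = random.Random(1)
data = ([(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(200)]
        + [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(200)])
sample = grid_centroids(data, cell=1.0)   # far fewer points than `data`
centers, _ = kmeans(sample, k=2)
```

The speed-up in this sketch comes only from clustering fewer points. Weighting each cell center by its point count would be a natural refinement, but it is not shown here.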

A Comparison of Cluster Analyses and Clustering of Sensory Data on Hanwoo Bulls (군집분석 비교 및 한우 관능평가데이터 군집화)

  • Kim, Jae-Hee;Ko, Yoon-Sil
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.4
    • /
    • pp.745-758
    • /
    • 2009
  • Cluster analysis is the automated search for groups of related observations in a data set. Many techniques have been proposed for grouping observations into clusters, and a variety of measures for validating the results of a cluster analysis have been suggested. In this paper, we compare complete linkage, Ward's method, K-means, and model-based clustering, and compute validity measures such as connectivity, the Dunn index, and silhouette on simulated data from multivariate distributions. We also select a clustering algorithm and determine the number of clusters of Korean consumers based on their palatability scores for Hanwoo bull prepared by the BBQ cooking method.
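
Two of the validity measures named in this abstract can be sketched directly from their standard definitions. The code below is an illustration only, assuming Euclidean distance and at least two clusters; the function names and the four-point example are hypothetical and are not the paper's data or implementation.

```python
import math


def mean_silhouette(points, labels):
    """Average silhouette width; singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # silhouette of a singleton is defined as 0
        # a: mean distance to the other members of the same cluster
        # (the point's distance to itself is 0, so dividing by n - 1 is correct).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest within-cluster distance."""
    between, within = [], []
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            (within if labels[i] == labels[j] else between).append(d)
    return min(between) / max(within)


# Hypothetical example: two tight pairs far apart.
pts = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = mean_silhouette(pts, [0, 0, 1, 1])   # close to 1: well separated
bad = mean_silhouette(pts, [0, 1, 0, 1])    # negative: poor assignment
```

For both measures, larger values indicate more compact and better separated clusters, which is how they are typically used to compare algorithms or choose the number of clusters.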

A Major DNA Marker Mining of microsatellite loci in Hanwoo Chromosome 17

  • Lee, Yong-Won;Lee, Je-Yeong
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2005.04a
    • /
    • pp.54-58
    • /
    • 2005
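  • We performed a QTL (quantitative trait loci) analysis on the genetic map of Hanwoo chromosome 17 and tested the significance of the selected loci using a permutation test. We then identified DNA markers associated with superior economic traits by K-means clustering. In addition, we computed confidence intervals for the DNA markers at the selected locus using the bootstrap method. Based on the QTL analysis, K-means clustering, and the bootstrap method, DNA markers 85 and 105 of BMS941 on Hanwoo chromosome 17 were selected as superior markers.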
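
The two resampling tools named in this abstract, the permutation test and the bootstrap confidence interval, can be illustrated generically. The sketch below assumes a two-group comparison of means and a simple percentile interval; it is not the authors' QTL procedure, and the function names and trait values are hypothetical.

```python
import random


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        px, py = pooled[:len(x)], pooled[len(x):]
        if abs(sum(px) / len(px) - sum(py) / len(py)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean (assumption: not BCa)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    stats = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lo = stats[int(n_boot * alpha / 2)]
    hi = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi


# Hypothetical trait values for animals with and without a marker.
with_marker = [12.0, 11.8, 12.3, 12.1, 11.9, 12.2]
without_marker = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
p_value = permutation_p_value(with_marker, without_marker)
interval = bootstrap_ci(with_marker)
```

A small p-value suggests the group difference is unlikely under random label assignment, and the interval gives a resampling-based range for the group mean.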

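Selection of Superior DNA Markers for ILSTS035 on Hanwoo Chromosome 6 by Permutation Test and Bootstrap Method (순열검정과 부스트랩 방법에 의한 한우 6번 염색체의 ILSTS035에 대한 우수 DNA Marker 선별)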

  • Lee, Yong-Won;Lee, Je-Yeong;Kim, Mun-Jeong;Han, Cho-Hui
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2003.10a
    • /
    • pp.325-329
    • /
    • 2003
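  • We performed a QTL (quantitative trait loci) analysis on the genetic map of Hanwoo chromosome 6 and tested the significance of the selected locus using a permutation test. We then identified DNA markers associated with superior economic traits by K-means clustering. Based on the QTL analysis and K-means clustering, DNA marker 235 of ILSTS035 on Hanwoo chromosome 6 was selected as a superior marker. The selected marker 235 was then applied to exhibition cattle and validated by computing confidence intervals with the bootstrap method.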

A Reinterpretation of Multiple Correspondence Analysis Using K-Means Cluster Analysis (K-평균 군집분석을 활용한 다중대응분석의 재해석)

  • 김경희;최용석
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2001.11a
    • /
    • pp.175-178
    • /
    • 2001
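  • Multiple correspondence analysis (MCA), which graphically displays the correspondence among categories in a multi-way contingency table, generally yields two-dimensional correspondence plots with low goodness-of-fit, because the principal inertias account for only a small proportion of the total inertia. Benzecri's formula can be used to raise the low principal inertias and obtain a new two-dimensional correspondence plot. However, this new plot still does not show the correspondence among categories clearly (Greenacre and Blasius, 1994, chapter 10). One can reinterpret the MCA by clustering the categories with an Andrews plot, but interpretation becomes difficult when the number of categories is large. In this note, we use K-means cluster analysis to make the MCA easier to interpret in such cases.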

Search for Superior DNA on Hanwoo Chromosome 6 Using the Bootstrap Method (한우 6번 염색체의 Bootstrap기법을 이용한 우수 DNA 탐색)

  • Lee, Je-Yeong;Yeo, Jeong-Su;Kim, Jae-Woo;Lee, Yong-Won;Kim, Mun-Jeong
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2003.05a
    • /
    • pp.41-47
    • /
    • 2003
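  • To improve Hanwoo quality, we performed a QTL (quantitative trait loci) analysis on the genetic map of Hanwoo chromosome 6 and evaluated the selected loci using a permutation test. We then used K-means clustering to identify DNA markers for superior economic traits according to economically important characteristics of Hanwoo (such as meat quality and meat yield). Based on the QTL analysis and K-means clustering, we selected DNA markers for each major economic trait at ILST035 on Hanwoo chromosome 6, and computed a confidence interval for each marker using the bootstrap BCa method.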

Comparison of clustering with yeast microarray gene expression data (효모 마이크로어레이 유전자발현 데이터에 대한 군집화 비교)

  • Lee, Kyung-A;Kim, Jae-Hee
    • Journal of the Korean Data and Information Science Society
    • /
    • v.22 no.4
    • /
    • pp.741-753
    • /
    • 2011
  • We perform clustering analyses of yeast cell cycle microarray expression data. We compare model-based clustering, K-means, PAM, SOM, and the hierarchical Ward method on the yeast data. As validity measures for the clustering results, connectivity, the Dunn index, and silhouette values are computed and compared.

Selecting Technique of Accident Sections using K-mean Method (K-평균법을 이용한 고속도로 사고분석구간 분할기법 개발)

  • Lee, Ki-Young;Chang, Myung-Soon
    • International Journal of Highway Engineering
    • /
    • v.7 no.4 s.26
    • /
    • pp.211-219
    • /
    • 2005
  • Selecting analysis sections for traffic accidents serves to identify accident causes clearly by grouping similar accidents, and to raise the effectiveness of improvement projects by prioritizing accidents. The existing method divides the road uniformly by mileage and does not consider similarities among accidents. Consequently, a slider-length method that considers accident types rather than road mileage has recently come into wide use. In this study, we propose applying the K-mean method, a non-hierarchical grouping technique from cluster analysis, to the slider-length method, so that accidents occurring at the nearest mileages are classified into one group. To verify the proposed method, we compared the K-mean method with division at regular intervals on data covering a total length of 25.6 km along the Kyung-bu freeway in the Pusan direction; the K-mean method proved effective in accounting for the similarities and adjacencies of accidents.

A Development of Customer Segmentation by Using Data Mining Technique (데이터마이닝에 의한 고객세분화 개발)

  • Jin, Seo-Hoon
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.3
    • /
    • pp.555-565
    • /
    • 2005
  • Knowing its customers is essential for a company to survive cut-throat competition. Companies need to manage the relationship with each and every customer, and make each customer as profitable as possible. CRM (customer relationship management) has emerged as a key solution for managing profitable relationships. Customer segmentation is an essential component of successful CRM. Clustering, as a data mining technique, is very useful for building data-driven segmentation. This paper is concerned with building a proper customer segmentation, illustrated with a credit card company case. The segmentation was built solely on transaction data that came from customers' activities. A two-step clustering approach consisting of k-means clustering and agglomerative clustering was applied to build the customer segmentation.
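
The abstract does not describe how the two steps are combined. One common arrangement, shown here purely as an assumption, is to run k-means with many fine clusters and then merge those clusters agglomeratively until the target number of segments remains. The function names, the centroid-linkage merge rule, and the synthetic data are hypothetical.

```python
import random


def mean(points):
    return tuple(sum(c) / len(points) for c in zip(*points))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iters=50, seed=0):
    """Step 1: plain Lloyd k-means, returning one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = mean(members)
    return labels


def two_step(points, k_fine, k_final, seed=0):
    """Step 2: merge fine clusters until k_final groups remain.

    Assumption: centroid linkage (merge the two groups with closest centroids).
    """
    labels = kmeans_labels(points, k_fine, seed=seed)
    groups = [[p for p, l in zip(points, labels) if l == j]
              for j in range(k_fine)]
    groups = [g for g in groups if g]  # drop empty fine clusters
    while len(groups) > k_final:
        pairs = ((a, b) for a in range(len(groups))
                 for b in range(a + 1, len(groups)))
        i, j = min(pairs, key=lambda ab: sq_dist(mean(groups[ab[0]]),
                                                 mean(groups[ab[1]])))
        groups[i] = groups[i] + groups[j]
        del groups[j]  # j > i, so index i stays valid
    return groups


# Synthetic "customers" with two features (illustration only).
rng = random.Random(2)
customers = ([(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(100)]
             + [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(100)])
segments = two_step(customers, k_fine=8, k_final=2)
```

Running k-means first keeps the agglomerative step cheap, since it then operates on a handful of fine clusters rather than on every transaction record.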