Initial Mode Decision Method for Clustering in Categorical Data

Yang, Soon-Cheol; Kang, Hyung-Chang; Kim, Chul-Soo

Journal of the Korean Data and Information Science Society

Volume 18 Issue 2 / Pages 481-488 / 2007 / 1598-9402 (pISSN)

The Korean Data and Information Science Society (한국데이터정보과학회)

Published : 2007.04.30

Abstract

The k-means algorithm is well known for its efficiency in clustering large data sets. However, because it works only on numeric values, it cannot be used to cluster real-world data containing categorical values. The k-modes algorithm extends the k-means paradigm to categorical domains, but it requires the initial points (modes) of the clusters to be preset or selected at random. This paper addresses that problem by applying the Max-Min method, one of the methods used to decide initial values in the k-means algorithm, to the k-modes algorithm. We also introduce new similarity measures for clustering categorical data. Tests on the mushroom and soybean data sets show that the proposed algorithm performs well in both accuracy and run time.
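The idea of seeding k-modes with a Max-Min selection can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the new similarity measures, so the sketch assumes the plain simple-matching dissimilarity (a count of mismatched attributes). Seeding with the first record is also an assumption, and the function names `mismatch`, `maxmin_initial_modes`, and `k_modes` are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Record = Sequence[str]


def mismatch(a: Record, b: Record) -> int:
    """Simple-matching dissimilarity: number of attributes that differ.

    Assumption: stands in for the paper's similarity measures, which the
    abstract does not specify.
    """
    return sum(x != y for x, y in zip(a, b))


def maxmin_initial_modes(data: List[Record], k: int) -> List[Record]:
    """Choose k initial modes by the max-min (farthest-first) rule."""
    modes = [data[0]]  # assumption: seed with the first record
    # min_dist[i] is the distance from record i to its nearest chosen mode.
    min_dist = [mismatch(r, modes[0]) for r in data]
    while len(modes) < k:
        # Take the record whose nearest mode is farthest away.
        idx = max(range(len(data)), key=min_dist.__getitem__)
        modes.append(data[idx])
        min_dist = [min(d, mismatch(r, data[idx]))
                    for d, r in zip(min_dist, data)]
    return modes


def k_modes(data: List[Record], k: int,
            max_iter: int = 100) -> Tuple[List[int], List[tuple]]:
    """Basic k-modes loop started from max-min initial modes."""
    modes = [tuple(m) for m in maxmin_initial_modes(data, k)]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each record joins its nearest mode.
        labels = [min(range(k), key=lambda j: mismatch(r, modes[j]))
                  for r in data]
        # Update step: the new mode takes the most frequent value
        # of each attribute among the cluster's members.
        new_modes = []
        for j in range(k):
            members = [r for r, lab in zip(data, labels) if lab == j]
            if not members:
                new_modes.append(modes[j])  # keep the mode of an empty cluster
                continue
            new_modes.append(tuple(Counter(col).most_common(1)[0][0]
                                   for col in zip(*members)))
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes
```

Because the selection step is deterministic once the seed record is fixed, repeated runs on the same data give the same initial modes, unlike random initialization.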
