[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.5391/JKIIS.2004.14.7.816

SVM based Clustering Technique for Processing High Dimensional Data

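Kim, Man-Sun (Information and Computing Group, Korea Research Institute of Standards and Science (KRISS); Department of Computer Engineering, Kongju National University)
Lee, Sang-Yong (Division of Information and Communication Engineering, Kongju National University)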

Publication Information

Journal of the Korean Institute of Intelligent Systems, v.14, no.7, 2004, pp. 816-820

Abstract

Clustering divides a data set into groups of similar data objects and extracts meaningful information from the data. The main issues in clustering are the effective handling of high dimensional data and optimization. This study proposes an SVM-based method for measuring similarity and an efficient new method for calculating the number of clusters. The high dimensional data are mapped into a feature space using kernel functions, and the similarity between neighboring clusters is then measured. For the clusters thus created, the desired number of clusters is obtained from the measured similarity value and the value of Δd. To verify the proposed methods, the authors used six data sets from the UCI Machine Learning Repository and obtained the presented number of clusters as well as improved cohesiveness compared to the results of previous research.
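The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Gaussian (RBF) kernel, measures cluster dissimilarity as the squared distance between cluster centroids in the kernel-induced feature space, and merges neighboring clusters until the closest pair is farther apart than a threshold. The names `rbf_kernel`, `feature_space_distance`, `merge_clusters`, and `gamma` are hypothetical, and `delta_d` is only an assumed stand-in for the paper's Δd.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2). Kernel choice is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mean_kernel(A, B, gamma):
    """Average kernel value over all pairs (a, b) with a in A and b in B."""
    return sum(rbf_kernel(a, b, gamma) for a in A for b in B) / (len(A) * len(B))

def feature_space_distance(A, B, gamma=0.5):
    """Squared distance between the feature-space centroids of clusters A and B.

    Kernel trick: ||mean(phi(A)) - mean(phi(B))||^2
                  = mean K(A, A) - 2 * mean K(A, B) + mean K(B, B)
    """
    return (mean_kernel(A, A, gamma)
            - 2.0 * mean_kernel(A, B, gamma)
            + mean_kernel(B, B, gamma))

def merge_clusters(clusters, delta_d, gamma=0.5):
    """Merge the closest pair of clusters until every pair is farther apart than delta_d.

    delta_d is an assumed merge threshold, not the paper's definition of Δd.
    """
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        # Distance of every cluster pair in feature space.
        pairs = [(feature_space_distance(clusters[i], clusters[j], gamma), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        d, i, j = min(pairs)
        if d > delta_d:
            break  # remaining clusters are considered distinct
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters

if __name__ == "__main__":
    a = [(0.0, 0.0), (0.1, 0.0)]
    b = [(0.0, 0.1), (0.1, 0.1)]
    c = [(10.0, 10.0), (10.1, 10.0)]
    print(len(merge_clusters([a, b, c], delta_d=0.5)))
```

In this sketch the number of clusters is not fixed in advance: it falls out of the threshold, which is the role the abstract attributes to the measured similarity together with Δd.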

Keywords

SVM; Clustering; high dimensional data

Citations & Related Records

Reference

1	R. O. Duda, P. E. Hart, 'Pattern Classification and Scene Analysis', A Wiley-Interscience Publication, New York, 1973
2	Tian Zhang, Raghu Ramakrishnan, and Miron Livny, 'BIRCH: An Efficient Data Clustering Method for Very Large Databases', Proc. of ACM SIGMOD Int. Conf. on Management of Data, pp.103-114, 1996
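3	Hye-Myung Lee, Young-Bae Park, 'A High-Dimensional Clustering Technique Using Incremental Projection', Journal of the Korea Information Science Society: Databases, Vol.28, No.4, pp.568-576, 2001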
4	http://www.ics.uci.edu/
5	M. Ester, H. Kriegel, Jorg Sander, and Xiaowei Xu, 'A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise', Proc. of Int. Conf. on Knowledge Discovery and Data Mining, 1996
6	Eun-Jeong Song, In-Soo Kang, Tae-Won Kim, Ki-Joon Lee, 'A Spatial Data Mining Method Based on Clustering Analysis', Proc. of the Korea Information Science Society Fall Conference (2), 1998
7	Tian Zhang, Raghu Ramakrishnan, and Miron Livny, 'BIRCH: An Efficient Data Clustering Method for Very Large Databases', ACM SIGMOD Conference on Management of Data, Montreal, Canada, June 1996
8	Raymond T. Ng, Jiawei Han, 'Efficient and Effective Clustering Methods for Spatial Data Mining', Proc. of 20th Int. Conf. on VLDB, pp.144-155, 1994
9	Mi-Hee Jang, Hye-Myung Lee, Young-Bae Park, 'Clustering of High-Dimensional Data Using Two-Dimensional Projection', Proc. of the Korea Information Science Society Fall Conference, Vol.28, No.2, 2001
10	http://svm.cs.rhbnc.ac.uk
11	http://www.kernel-machines.org
