An Improved K-means Document Clustering using Concept Vectors

Shin, Yang-Kyu;

Journal of the Korean Data and Information Science Society

Volume 14, Issue 4 / Pages 853-861 / 2003 / 1598-9402 (pISSN)

The Korean Data and Information Science Society

Published: 2003.11.30

Abstract

We present an improved K-means document clustering method in which a concept vector is computed for each cluster on the basis of the cosine similarity of text documents. The concept vectors are unit vectors, normalized to lie on the n-dimensional sphere. Because the standard K-means method is sensitive to its initial starting condition, our improvement focuses on the starting condition for estimating the modes of a distribution. We applied the improved K-means clustering algorithm to a set of text documents called Classic3 to test the efficiency and correctness of the clustering result; it showed a 7% improvement in its worst case.
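
To make the role of the concept vectors concrete, the following is a minimal sketch of K-means with cosine similarity, where each cluster is represented by the normalized sum of its member documents. It is not the paper's algorithm: the abstract does not describe the improved starting condition, so this sketch assumes plain random seeding, and the names `normalize_rows` and `spherical_kmeans` are hypothetical.

```python
import numpy as np


def normalize_rows(X):
    """Scale each row to unit length; all-zero rows are left unchanged."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return X / norms


def spherical_kmeans(X, k, n_iter=20, seed=0):
    """Cluster the rows of X (document vectors) by cosine similarity.

    Returns (labels, concepts), where concepts[j] is the unit-length
    concept vector of cluster j.
    """
    rng = np.random.default_rng(seed)
    X = normalize_rows(np.asarray(X, dtype=float))

    # Assumption: random seeding from the documents themselves. The paper's
    # improved starting condition is not specified in the abstract.
    concepts = X[rng.choice(len(X), size=k, replace=False)].copy()

    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # For unit vectors, the dot product equals the cosine similarity.
        sims = X @ concepts.T
        new_labels = sims.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels

        # Concept vector = normalized sum of the cluster's documents.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                concepts[j] = normalize_rows(
                    members.sum(axis=0, keepdims=True)
                )[0]
    return labels, concepts


if __name__ == "__main__":
    docs = np.array([
        [1.0, 0.0, 0.0],
        [0.9, 0.1, 0.0],
        [0.0, 0.0, 1.0],
        [0.0, 0.1, 0.9],
    ])
    labels, concepts = spherical_kmeans(docs, k=2, seed=0)
    print(labels)
    print(np.linalg.norm(concepts, axis=1))
```

With random seeding, different values of `seed` can lead to different final partitions on the same data, which is the sensitivity to the starting condition that the abstract refers to.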
