Refinement of Document Clustering by Using NMF

  • Shinnou, Hiroyuki (Department of Computer and Information Sciences, Ibaraki University)
  • Sasaki, Minoru (Department of Computer and Information Sciences, Ibaraki University)
  • Published : 2007.11.01

Abstract

In this paper, we use non-negative matrix factorization (NMF) to refine document clustering results. NMF is a dimensionality reduction method that is effective for document clustering, because a term-document matrix is high-dimensional and sparse. Because the initial matrix of the NMF algorithm can be regarded as a clustering result, NMF can serve as a refinement method. We first perform min-max cut (Mcut), a powerful spectral clustering method, and then refine its result via NMF, which should yield a more accurate clustering. However, NMF often fails to improve the given clustering result. To overcome this problem, we use the Mcut objective function to decide when to stop the NMF iterations.
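
The refinement loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the NMF update rule, the similarity measure, or the exact stopping test, so the sketch assumes the standard multiplicative updates for the Frobenius-norm objective, a similarity matrix S = XᵀX computed from the term-document matrix X, and a rule that stops as soon as the Mcut value gets worse. The function names `mcut_objective` and `refine_with_nmf` are hypothetical.

```python
import numpy as np

def mcut_objective(S, labels, k):
    """Mcut value: sum over clusters of cut(C, rest) / within(C). Lower is better."""
    total = 0.0
    for c in range(k):
        mask = labels == c
        within = S[np.ix_(mask, mask)].sum()   # similarity inside cluster c
        cut = S[np.ix_(mask, ~mask)].sum()     # similarity leaving cluster c
        if within == 0:                        # empty or degenerate cluster
            return np.inf
        total += cut / within
    return total

def refine_with_nmf(X, labels, k, max_iter=50, eps=1e-9):
    """Refine an initial clustering of the columns (documents) of X.

    X is a non-negative terms x documents matrix; labels holds one
    cluster index in [0, k) per document (e.g. the output of Mcut).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[1]
    S = X.T @ X                                # assumed document similarity

    # Encode the initial clustering as the starting factor V (documents x k).
    V = np.full((n, k), 0.1)
    V[np.arange(n), labels] = 1.0
    U = np.maximum(X @ V, eps)                 # terms x k, roughly cluster sums

    best_labels = labels.copy()
    best_score = mcut_objective(S, best_labels, k)

    for _ in range(max_iter):
        # Multiplicative updates for min ||X - U V^T||_F (assumed update rule).
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)

        new_labels = V.argmax(axis=1)          # document -> strongest factor
        score = mcut_objective(S, new_labels, k)
        if score > best_score:                 # Mcut got worse: stop here
            break
        best_labels, best_score = new_labels, score

    return best_labels, best_score
```

Because the loop only accepts a labelling whose Mcut value is no worse than the best one seen so far, the returned clustering is never worse than the initial one under that objective. This mirrors the motivation stated in the abstract, where unconstrained NMF iterations can degrade the given result.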

Keywords