Consistent Triplets of Candidate Paralogs by Graph Clustering

  • Yun, Hwa-Seob (Department of Computer Science, Rutgers University)
  • Muchnik, Ilya (Department of Computer Science, Rutgers University, DIMACS, Rutgers University)
  • Kulikowski, Casimir (Department of Computer Science, Rutgers University)
  • Published : 2005.09.22

Abstract

We introduce a fully automatic clustering method to classify candidate paralog clusters from a set of protein sequences within one genome. The set of protein sequences is represented as a graph of protein relationships: each node is a protein, represented by its amino acid sequence, and the sequence similarities among the proteins constitute the edges. We use graph-based clustering methods to identify structurally consistent sets of nodes that are strongly connected to each other. Our results are consistent with those from current leading systems based on manual curation, such as COG/KOG and KEGG. All the results are viewable at http://www.cs.rutgers.edu/~seabee.
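
The graph representation described above can be illustrated with a minimal sketch. The abstract does not specify the clustering procedure, so this is not the authors' method: it assumes that "strongly connected" means pairwise similarity at or above a hypothetical threshold, and that a consistent triplet is three proteins that are all mutually connected. The protein names, the scores, the threshold value, and the function names are invented for illustration.

```python
from itertools import combinations


def build_similarity_graph(scores, threshold):
    """Build an undirected graph as an adjacency dict.

    scores: dict mapping (protein_a, protein_b) pairs to a similarity score.
    threshold: hypothetical cutoff; pairs scoring below it get no edge.
    """
    graph = {}
    for (a, b), score in scores.items():
        # Register both proteins as nodes even if no edge is created.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if a != b and score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def find_triplets(graph):
    """Return all sets of three mutually connected nodes (triangles)."""
    triplets = set()
    for a in graph:
        # Any two neighbours of a that are also neighbours of each other
        # close a triangle with a.
        for b, c in combinations(sorted(graph[a]), 2):
            if c in graph[b]:
                triplets.add(tuple(sorted((a, b, c))))
    return sorted(triplets)


# Illustrative (made-up) pairwise similarity scores.
scores = {
    ("P1", "P2"): 0.90,
    ("P1", "P3"): 0.80,
    ("P2", "P3"): 0.85,
    ("P3", "P4"): 0.70,
    ("P4", "P5"): 0.20,
}

graph = build_similarity_graph(scores, threshold=0.5)
triplets = find_triplets(graph)
print(triplets)
```

In this toy input, only P1, P2, and P3 are all pairwise similar, so they form the single triplet; P4 is linked to P3 alone, and P5 has no edge above the cutoff.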

Keywords