http://dx.doi.org/10.3745/KIPSTD.2006.13D.2.207

Automatic Orthologous-Protein-Clustering from Multiple Complete-Genomes by the Best Reciprocal BLAST Hits  

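Kim Sun-Shin (Department of Computer Science, Graduate School, Chungbuk National University)
Rhee Chung-Sei (Department of Computer Science, Chungbuk National University)
Ryu Keun-Ho (Department of Computer Science, Chungbuk National University)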
Abstract
Although the number of completely sequenced genomes has grown quickly in recent years, methods that predict protein function by homology across these genomes remain underused. Constructing Orthologous Protein Clusters (OPCs) from the best reciprocal BLAST hits among multiple complete genomes has been a successful technique, but building the OPCs by hand is time-consuming. Here we propose an automatic method that clusters Orthologous Proteins (OPs) from multiple complete genomes. The method extends INPARANOID, an automatic program that detects OPs between two complete genomes. We also give a mathematical proof that covers all possible clusterings.
Keywords
Genome; Protein Function; Orthologous Group; Automatic Method; Cluster;
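
The best-reciprocal-hit idea named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm, and it does not reproduce INPARANOID. It assumes that BLAST results are already available as (query, subject, score) tuples and that a hypothetical `genome_of` mapping assigns each protein to its genome. Merging reciprocal pairs into clusters by connected components is also an assumption made for illustration; the paper's actual clustering rule is not stated in the abstract.

```python
from collections import defaultdict


def best_hits(hits, genome_of):
    """Map (query, target genome) -> best-scoring subject in that genome."""
    best = {}
    for query, subject, score in hits:
        if genome_of[query] == genome_of[subject]:
            continue  # ignore hits within the same genome
        key = (query, genome_of[subject])
        # Strict '>' keeps the first hit seen when scores tie.
        if key not in best or score > best[key][1]:
            best[key] = (subject, score)
    return {key: subject for key, (subject, _) in best.items()}


def reciprocal_pairs(hits, genome_of):
    """Return protein pairs that are each other's best hit across genomes."""
    best = best_hits(hits, genome_of)
    pairs = set()
    for (query, _), subject in best.items():
        if best.get((subject, genome_of[query])) == query:
            pairs.add(frozenset((query, subject)))
    return pairs


def cluster(pairs):
    """Merge reciprocal pairs into clusters (connected components, an assumption)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for pair in pairs:
        a, b = tuple(pair)
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return sorted(sorted(group) for group in groups.values())


# Toy data: three hypothetical genomes A, B, C with made-up scores.
genome_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
hits = [
    ("a1", "b1", 200), ("a1", "b2", 50), ("b1", "a1", 190),
    ("b1", "c1", 150), ("c1", "b1", 160),
    ("c1", "a1", 120), ("a1", "c1", 130),
    ("a2", "b2", 90), ("b2", "a2", 95), ("b2", "a1", 40),
]
clusters = cluster(reciprocal_pairs(hits, genome_of))
print(clusters)
```

Connected-component merging is the simplest way to go from pairwise results to multi-genome groups, but it can chain unrelated proteins together through a single spurious pair, so a stricter rule may be preferable in practice.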