Biological Network Evolution Hypothesis Applied to Protein Structural Interactome

  • Bolser, Dan M. (MRC-DUNN Human Nutrition Unit, Hills Road, Cambridge, CB2 2XY, England)
  • Park, Jong Hwa (MRC-DUNN Human Nutrition Unit, Hills Road, Cambridge, CB2 2XY, England; Object Interaction Technologies Inc. (OITEK))
  • Published : 2003.09.01

Abstract

The latest measure of the relative evolutionary age of protein structure families, which is based on taxonomic diversity, was applied using the protein structural interactome map (PSIMAP). The result confirms that, in general, protein domains that are hubs in this interaction network are older than protein domains with fewer interaction partners. We apply a hypothesis of 'biological network evolution' to explain this positive correlation between the number of interaction partners and age. The hypothesis agrees with previous suggestions that proteins have acquired an increasing number of interaction partners over time through the stepwise addition of new interactions. It is also shown to be consistent with the scale-free interaction network topologies proposed by other groups. Closely co-evolved structural interactions and the dynamics of network evolution are used to explain the highly conserved core of protein interaction pathways, a core that exists across all divisions of life.
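
The link between stepwise addition of interactions, hub formation, and node age can be illustrated with a toy growth model. The sketch below is not the authors' method and uses no PSIMAP data: it assumes a simple preferential-attachment rule (each new node links to one existing node chosen in proportion to that node's current number of partners), which is one common way to obtain a scale-free topology. The function names `grow_network` and `mean_degree`, the network size, and the age windows are all hypothetical choices made for illustration.

```python
import random


def grow_network(n_nodes, seed=0):
    """Grow a toy network one node at a time (assumed growth rule, not from the paper).

    Node labels double as ages: node 0 is the oldest, node n_nodes - 1 the youngest.
    Each new node forms one interaction with an existing node picked with
    probability proportional to that node's current degree.
    """
    rng = random.Random(seed)
    degree = {0: 1, 1: 1}   # start from a single interacting pair
    endpoints = [0, 1]      # every node appears once per interaction it takes part in
    for new in range(2, n_nodes):
        partner = rng.choice(endpoints)  # degree-proportional choice
        degree[new] = 1
        degree[partner] += 1
        endpoints.extend([new, partner])
    return degree


def mean_degree(degree, nodes):
    """Average number of interaction partners over the given nodes."""
    nodes = list(nodes)
    return sum(degree[n] for n in nodes) / len(nodes)


degree = grow_network(2000)
oldest = mean_degree(degree, range(0, 200))         # first 10% of nodes added
youngest = mean_degree(degree, range(1800, 2000))   # last 10% of nodes added
print(f"mean partners, oldest 10%:   {oldest:.2f}")
print(f"mean partners, youngest 10%: {youngest:.2f}")
```

In this toy model the oldest nodes end up with more partners on average than the youngest ones, which mirrors the positive correlation between interaction count and age described above. Real domain interaction data would need an independent age estimate, such as the taxonomic diversity measure mentioned in the abstract.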
