Protein Secondary Structure Prediction using Multiple Neural Network Likelihood Models

  • Kim, Seong-Gon (CISE department, University of Florida)
  • Kim, Yong-Gi (Dept of Computer Science, Gyeongsang National University)
  • Received : 2010.04.05
  • Accepted : 2010.09.30
  • Published : 2010.12.25

Abstract

Predicting the alpha-helices, beta-sheets, and turns of a protein's secondary structure is a complex non-linear task that has been approached with several techniques, including neural networks, genetic algorithms, decision trees, and other statistical or heuristic methods. This project introduces a new machine learning method that combines Bayesian inference with offline-trained multilayer perceptron (MLP) models, which serve as the likelihood for protein secondary structure prediction. Information about neighboring amino acids is extracted using windows of varying size and passed back and forth between the neural network and the Bayesian inference process until the posterior probability of the secondary structure converges.
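The abstract does not give the update rule, so the following is only a minimal sketch of one way the loop could look: a posterior over three classes is refined by folding in one likelihood per window size and stopping once it changes by less than a tolerance. The names `bayes_update`, `window`, `predict_position`, and `toy_likelihood`, the `tol` parameter, the uniform prior, and the stopping rule are all assumptions made for illustration. In particular, `toy_likelihood` is a hand-written stand-in for a trained MLP, not the authors' model.

```python
CLASSES = ("helix", "sheet", "turn")


def bayes_update(prior, likelihood):
    """Return the posterior, proportional to likelihood * prior, normalized to sum to 1."""
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("likelihood assigns zero probability to every class")
    return [u / total for u in unnorm]


def window(sequence, center, size):
    """Residues in a window of odd `size` around `center`, padded with '-' at the ends."""
    half = size // 2
    return "".join(
        sequence[i] if 0 <= i < len(sequence) else "-"
        for i in range(center - half, center + half + 1)
    )


def predict_position(sequence, center, likelihood_models, tol=1e-3):
    """Fold in one likelihood model per window size until the posterior stabilizes.

    `likelihood_models` maps a window size to a callable that returns one
    likelihood value per class (assumed interface, not taken from the paper).
    """
    posterior = [1.0 / len(CLASSES)] * len(CLASSES)  # assumed uniform prior
    for size, model in sorted(likelihood_models.items()):
        updated = bayes_update(posterior, model(window(sequence, center, size)))
        change = max(abs(a - b) for a, b in zip(updated, posterior))
        posterior = updated
        if change < tol:  # assumed convergence criterion
            break
    return posterior


def toy_likelihood(fragment):
    """Hypothetical stand-in for a trained MLP; scores residue composition only."""
    n = max(len(fragment), 1)
    helix_like = sum(fragment.count(r) for r in "AELM") / n
    sheet_like = sum(fragment.count(r) for r in "VIYF") / n
    # A small floor keeps every class at non-zero likelihood.
    return [0.1 + helix_like, 0.1 + sheet_like, 0.1 + (1.0 - helix_like - sheet_like)]


models = {3: toy_likelihood, 5: toy_likelihood, 7: toy_likelihood}
posterior = predict_position("MAELLEAAKV", 4, models)
print(dict(zip(CLASSES, posterior)))
```

In a real system each entry of `models` would wrap a separately trained network for that window size. The sketch processes each window size once, which avoids counting the same evidence twice; whether the published method iterates differently cannot be determined from the abstract.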
