http://dx.doi.org/10.6109/jkiice.2018.22.1.26

A Performance Comparison of Protein Profiles for the Prediction of Protein Secondary Structures  

Chi, Sang-Mun (Department of Computer Science, Kyungsung University)
Abstract
Protein secondary structure is important information for studying the evolution, structure, and function of proteins. Recently, deep learning methods have been actively applied to predicting the secondary structure of proteins from protein sequence information alone. The most widely used input features in these methods are protein profiles derived from protein sequences. In this paper, to obtain effective protein profiles, we constructed profiles using protein sequence search methods such as PSI-BLAST and HHblits. We varied two settings: the similarity threshold that determines which homologous protein sequences are used to construct the profile, and the number of profile-construction iterations that use the homologous sequence information. The resulting profiles were used as inputs to convolutional neural networks and recurrent neural networks to predict secondary structures. Profiles created by adding evolutionary information only once were effective.
Keywords
Protein secondary structure; Protein profile; Protein sequence search; PSI-BLAST; HHblits
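
To make the idea of a protein profile concrete, the following is a minimal sketch of a single profile-construction pass, not the procedure used by PSI-BLAST or HHblits. It assumes the homologous sequences are already aligned to the query with no insertions (every hit has the query's length, with `-` for gaps). The names `build_profile`, `identity`, `min_identity`, and `pseudocount` are hypothetical, and the sequence-identity cutoff is only an illustrative stand-in for the similarity threshold mentioned in the abstract.

```python
# Illustrative only: a simple per-position frequency profile, not a PSI-BLAST PSSM
# or an HHblits HMM. All names and default values here are assumptions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def identity(query, hit):
    """Fraction of query positions where the aligned hit has the same residue."""
    matches = sum(1 for q, h in zip(query, hit) if q == h)
    return matches / len(query)


def build_profile(query, aligned_hits, min_identity=0.3, pseudocount=1.0):
    """Return a list of L rows, each holding 20 amino acid frequencies."""
    # Keep the query plus every hit that passes the (assumed) similarity cutoff.
    selected = [query] + [h for h in aligned_hits if identity(query, h) >= min_identity]
    profile = []
    for pos in range(len(query)):
        counts = [pseudocount] * len(AMINO_ACIDS)
        for seq in selected:
            aa = seq[pos]
            if aa in AA_INDEX:  # gaps and unknown residues are skipped
                counts[AA_INDEX[aa]] += 1.0
        total = sum(counts)
        profile.append([c / total for c in counts])
    return profile


query = "ACDA"
hits = ["ACDA", "AC-A", "WWWW"]  # the last hit falls below the cutoff
profile = build_profile(query, hits, min_identity=0.5)
```

Each row of `profile` is a 20-dimensional feature vector for one residue, which is the kind of per-position input a convolutional or recurrent network would consume. Raising `min_identity` admits fewer homologs, and repeating the search with the profile as the new query would correspond to additional iterations.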