Architectures of Convolutional Neural Networks for the Prediction of Protein Secondary Structures

Chi, Sang-Mun

doi:10.6109/jkiice.2018.22.4.728

Journal of the Korea Institute of Information and Communication Engineering (한국정보통신학회논문지)

Volume 22 Issue 5 / Pages 728-733 / 2018 / 2234-4772 (pISSN) / 2288-4165 (eISSN)

The Korea Institute of Information and Communication Engineering (한국정보통신학회)

Chi, Sang-Mun (Department of Computer Science, Kyungsung University)

지상문

Received : 2018.01.28
Accepted : 2018.04.16
Published : 2018.05.31

https://doi.org/10.6109/jkiice.2018.22.4.728

Abstract

Deep learning has been actively studied for predicting protein secondary structure from only the amino acid sequence of a protein. In this paper, we compare the performance of convolutional neural networks with various architectures on protein secondary structure prediction. To find a suitable network depth for this task, we measured performance as a function of the number of layers. We also applied the structures of GoogLeNet and ResNet, which serve as building blocks of many image classification methods. These structures extract diverse features from the input data or keep gradient propagation smooth during training even when the network is deep. We modified these convolutional architectures to suit the characteristics of protein data, which improved performance.
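
The two building blocks named in the abstract can be illustrated with a small, dependency-free sketch. It is not the author's implementation: the abstract does not give layer counts, kernel widths, or feature sizes, so every name and number below (`conv1d_same`, `inception_block`, `residual_block`, 12 residues, 21 input features, kernel widths 1, 3, and 5) is an assumption chosen only for illustration. The sketch shows a GoogLeNet-style block that concatenates parallel convolutions of different widths along the sequence, and a ResNet-style block that adds its input back to its output.

```python
import random

def conv1d_same(x, kernels, bias):
    """1D convolution along the sequence with zero padding (output length = input length).

    x:       list of L positions, each a list of C_in features
    kernels: list of C_out filters; each filter is K taps of C_in weights
    bias:    list of C_out biases
    """
    length = len(x)
    k = len(kernels[0])
    half = k // 2
    out = []
    for pos in range(length):
        row = []
        for f, b in zip(kernels, bias):
            acc = b
            for t in range(k):
                src = pos + t - half
                if 0 <= src < length:  # positions outside the sequence count as zeros
                    acc += sum(w * v for w, v in zip(f[t], x[src]))
            row.append(acc)
        out.append(row)
    return out

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

def random_kernels(rng, c_out, k, c_in, scale=0.1):
    """Hypothetical helper: small random weights of shape (c_out, k, c_in)."""
    return [[[rng.uniform(-scale, scale) for _ in range(c_in)]
             for _ in range(k)] for _ in range(c_out)]

def inception_block(x, branches):
    """GoogLeNet-style block: parallel convolutions of different widths,
    concatenated along the channel axis at every sequence position."""
    outs = [relu(conv1d_same(x, w, b)) for w, b in branches]
    return [sum((o[pos] for o in outs), []) for pos in range(len(x))]

def residual_block(x, w, b):
    """ResNet-style block: output = x + F(x). The identity path gives
    gradients a route around F, which eases training of deep stacks."""
    fx = conv1d_same(relu(x), w, b)
    return [[xi + fi for xi, fi in zip(xr, fr)] for xr, fr in zip(x, fx)]

rng = random.Random(0)
seq_len, c_in = 12, 21  # assumed sizes: 12 residues, 21 features per residue
x = [[rng.random() for _ in range(c_in)] for _ in range(seq_len)]

# Three branches with kernel widths 1, 3, and 5, four filters each.
branches = [(random_kernels(rng, 4, k, c_in), [0.0] * 4) for k in (1, 3, 5)]
h = inception_block(x, branches)          # 12 positions x 12 channels

w = random_kernels(rng, 12, 3, 12)
y = residual_block(h, w, [0.0] * 12)      # same shape as h
print(len(y), len(y[0]))                  # prints "12 12"
```

Because both blocks preserve the sequence length, they can be stacked freely and followed by a per-residue classifier. The residual block also requires its convolution to produce as many channels as it receives, so that the addition is well defined.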
