A Protein-Protein Interaction Extraction Approach Based on Large Pre-trained Language Model and Adversarial Training |
Tang, Zhan (College of Information and Electrical Engineering, China Agricultural University)
Guo, Xuchao (College of Information and Electrical Engineering, China Agricultural University)
Bai, Zhao (College of Information and Electrical Engineering, China Agricultural University)
Diao, Lei (College of Information and Electrical Engineering, China Agricultural University)
Lu, Shuhan (School of Information, University of Michigan)
Li, Lin (College of Information and Electrical Engineering, China Agricultural University) |
1 | S. Pyysalo, F. Ginter, J. Heimonen, J. Björne, J. Boberg, J. Järvinen and T. Salakoski, "BioInfer: a corpus for information extraction in the biomedical domain," BMC Bioinformatics, vol. 8, 2007. |
2 | C. Nedellec, "Learning language in logic - genic interaction extraction challenge," in Proc. of Learn. Lang. Logic Workshop, pp. 1-7, 2005. |
3 | S. Pyysalo, A. Airola, J. Heimonen, J. Björne, F. Ginter and T. Salakoski, "Comparative analysis of five protein-protein interaction corpora," BMC Bioinformatics, vol. 9, Article no. S6, 2008. |
4 | C. Quan, L. Hua, X. Sun and W. Bai, "Multichannel Convolutional Neural Network for Biological Relation Extraction," Biomed Res Int, vol. 2016, no. 1850404, 2016. |
5 | W. A. Baumgartner, Z. Lu, H. L. Johnson, J. G. Caporaso, J. Paquette, A. Lindemann, E. K. White, O. Medvedeva, K. B. Cohen and L. Hunter, "Concept recognition for extracting protein interaction relations from biomedical text," Genome Biology, vol. 9, Article no. S9, 2008. |
6 | G. Murugesan, S. Abdulkadhar and J. Natarajan, "Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature," PLOS ONE, vol. 12, p. e0187379, 2017. |
7 | Z. H. Zhao, Z. H. Yang, H. F. Lin, J. Wang and S. Gao, "A protein-protein interaction extraction approach based on deep neural network," Int J Data Min Bioinform, vol. 15, pp. 145-164, 2016. |
8 | M. Jian, K. M. Lam, J. Dong, et al., "Visual-Patch-Attention-Aware Saliency Detection," IEEE Transactions on Cybernetics, vol. 45, no. 8, pp. 1575-1586, 2015. |
9 | Y. Peng and Z. Lu, "Deep learning for extracting protein-protein interactions from biomedical literature," in Proc. of The BioNLP 2017 workshop, pp. 29-38, 2017. |
10 | S. P. Choi, "Extraction of protein-protein interactions (PPIs) from the literature by deep convolutional neural networks with various feature embeddings," J Inf Sci, vol. 44, pp. 60-73, 2018. |
11 | L. Hua and C. Quan, "A Shortest Dependency Path Based Convolutional Neural Network for Protein-Protein Relation Extraction," Biomed Res Int, vol. 2016, no. 8479587, 2016. |
12 | D. Kwon, J. H. Yoon, S.-Y. Shin, T.-H. Jang, H.-G. Kim, I. So, J.-H. Jeon and H. H. Park, "A comprehensive manually curated protein-protein interaction database for the Death Domain superfamily," Nucleic Acids Research, vol. 40, pp. D331-D336, 2012. |
13 | D. E. Gordon, G. M. Jang, M. Bouhaddou, J. W. Xu, K. Obernier, K. M. White, M. J. O'Meara, V. V. Rezelj, J. F. Z. Guo, D. L. Swaney et al., "A SARS-CoV-2 protein interaction map reveals targets for drug repurposing," Nature, vol. 583, pp. 459-468, 2020. |
14 | A. Airola, S. Pyysalo, J. Björne, T. Pahikkala, F. Ginter and T. Salakoski, "All-paths graph kernel for protein-protein interaction extraction with evaluation of cross-corpus learning," BMC Bioinformatics, vol. 9, Article no. S2, 2008. |
15 | N. Warikoo, Y.-C. Chang and W.-L. Hsu, "LBERT: Lexically aware Transformer-based Bidirectional Encoder Representation model for learning universal bio-entity relations," Bioinformatics, vol. 37, pp. 404-412, 2021. |
16 | J. Devlin, M.-W. Chang, K. Lee and K. Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding," in Proc. of NAACL, Minneapolis, Minnesota, USA, pp. 4171-4186, 2019. |
17 | H. Yang, J. Yuan, C. Li, G. Zhao, Z. Sun, Q. Yao, B. Bao, A. V. Vasilakos and J. Zhang, "BrainIoT: Brain-Like Productive Services Provisioning with Federated Learning in Industrial IoT," IEEE Internet of Things Journal, vol. 9, pp. 2014-2024, 2022. |
18 | H. Zhang, R. C. Guan, F. F. Zhou, Y. C. Liang, Z. H. Zhan, L. Huang and X. Y. Feng, "Deep Residual Convolutional Neural Network for Protein-Protein Interaction Extraction," IEEE Access, vol. 7, pp. 89354-89365, 2019. |
19 | Y. L. Hsieh, Y. C. Chang, N. W. Chang and W. L. Hsu, "Identifying Protein-protein Interactions in Biomedical Literature using Recurrent Neural Networks with Long Short-Term Memory," in Proc. of The 8th IJCNLP, pp. 240-245, 2017. |
20 | M. Ahmed, J. Islam, M. R. Samee, and R. E. Mercer, "Identifying Protein-Protein Interaction using Tree LSTM and Structured Attention," in Proc. of IEEE 13th ICSC, 2019. |
21 | J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So and J. Kang, "BioBERT: a pre-trained biomedical language representation model for biomedical text mining," Bioinformatics, vol. 36, pp. 1234-1240, 2020. |
22 | A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser and I. Polosukhin, "Attention is all you need," in Proc. of 31st ICNIPS, Long Beach, California, USA, pp. 6000-6010, 2017. |
23 | J. Pennington, R. Socher and C. Manning, "GloVe: Global Vectors for Word Representation," in Proc. of EMNLP, 2014. |
24 | M. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee and L. Zettlemoyer, "Deep Contextualized Word Representations," in Proc. of NAACL, New Orleans, Louisiana, USA, pp. 2227-2237, 2018. |
25 | T. Mikolov, I. Sutskever, K. Chen, G. Corrado and J. Dean, "Distributed Representations of Words and Phrases and their Compositionality," in Proc. of NIPS, 2013. |
26 | A. Radford, K. Narasimhan, T. Salimans and I. Sutskever, "Improving Language Understanding by Generative Pre-Training," in Proc. of NLPIR, 2019. |
27 | T. Miyato, A. M. Dai and I. Goodfellow, "Adversarial Training Methods for Semi-Supervised Text Classification," in Proc. of ICLR, 2017. |
28 | M. Jian, J. Wang, H. Yu, et al., "Visual saliency detection by integrating spatial position prior of object with background cues," Expert Systems with Applications, vol. 168, no. 11, p. 114219, 2020. |
29 | M. Jian, W. Zhang, H. Yu, et al., "Saliency Detection Based on Directional Patches Extraction and Principal Local Color Contrast," Journal of Visual Communication and Image Representation, vol. 57, pp. 1-11, 2018. |
30 | Y.-C. Chang, C.-H. Chu, Y.-C. Su, C. C. Chen and W.-L. Hsu, "PIPE: a protein-protein interaction passage extraction module for BioCreative challenge," Database, vol. 2016, no. baw101, 2016. |
31 | J. Howard and S. Ruder, "Universal Language Model Fine-tuning for Text Classification," in Proc. of ACL, Melbourne, Australia, pp. 328-339, 2018. |
32 | C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow and R. Fergus, "Intriguing properties of neural networks," in Proc. of ICLR, 2014. |
33 | I. J. Goodfellow, J. Shlens and C. Szegedy, "Explaining and Harnessing Adversarial Examples," in Proc. of ICLR, 2015. |
34 | A. Madry, A. Makelov, L. Schmidt, D. Tsipras and A. Vladu, "Towards Deep Learning Models Resistant to Adversarial Attacks," in Proc. of ICLR, 2018. |
35 | J. L. Ba, J. R. Kiros and G. E. Hinton, "Layer Normalization," arXiv, 2016. |
36 | M. Jian, Q. Qi, J. Dong, et al., "Integrating QDWD with pattern distinctness and local contrast for underwater saliency detection," Journal of Visual Communication and Image Representation, vol. 53, pp. 31-41, 2018. |
37 | R. Bunescu, R. Ge, R. J. Kate, E. M. Marcotte, R. J. Mooney, A. K. Ramani and Y. W. Wong, "Comparative experiments on learning information extractors for proteins and their interactions," Artificial Intelligence in Medicine, vol. 33, pp. 139-155, 2005. |
38 | K. Fundel, R. Kueffner and R. Zimmer, "RelEx - Relation extraction using dependency parse trees," Bioinformatics, vol. 23, pp. 365-371, 2007. |
39 | J. Ding, D. Berleant, D. Nettleton and E. Wurtele, "Mining Medline: Abstracts, Sentences, Or Phrases?," in Proc. of Pacific Symposium on Biocomputing, vol. 7, pp. 326-337, 2002. |
40 | T. Yu, R. Jin, X. Han, J. Li and T. Yu, "Review of Pre-training Models for Natural Language Processing," CEA, vol. 56, no. 23, pp. 12-22, 2020. |
41 | M. Altmann, S. Altmann, P. A. Rodriguez, B. Weller, L. E. Vergara, J. Palme, N. M. de la Rosa, M. Sauer, M. Wenig, J. A. Villaecija-Aguilar et al., "Extensive signal integration by the phytohormone protein network," Nature, vol. 583, pp. 271-276, 2020. |
42 | S. Yadav, A. Ekbal, S. Saha, A. Kumar and P. Bhattacharyya, "Feature assisted stacked attentive shortest dependency path based Bi-LSTM model for protein-protein interaction," Knowledge-Based Syst, vol. 166, pp. 18-29, 2019. |
43 | K. Yu, P.-Y. Lung, T. Zhao, P. Zhao, Y.-Y. Tseng and J. Zhang, "Automatic extraction of protein-protein interactions using grammatical relationship graph," BMC Medical Informatics and Decision Making, vol. 18, 2018. |