http://dx.doi.org/10.3745/KTSDE.2022.11.1.11

De Novo Drug Design Using Self-Attention Based Variational Autoencoder  

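Piao, Shengmin (Department of Computer Science, Yonsei University)
Choi, Jonghwan (Department of Computer Science, Yonsei University)
Seo, Sangmin (Department of Computer Science, Yonsei University)
Kim, Kyeonghun (Department of Computer Science, Yonsei University)
Park, Sanghyun (Department of Computer Science, Yonsei University)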
Publication Information
KIPS Transactions on Software and Data Engineering / v.11, no.1, 2022, pp. 11-18
Abstract
De novo drug design is the process of developing new drugs that can interact with biological targets such as protein receptors. The traditional process consists of drug candidate discovery and drug development, and it takes more than 10 years to develop a new drug. Deep learning-based methods are being studied to shorten this period and to find candidate compounds efficiently. Many existing deep learning-based drug design models use recurrent neural networks to generate chemical entities represented as SMILES strings, but recurrent networks train slowly and learn the complex rules of molecular formula notation poorly, which leaves room for improvement. To overcome these shortcomings, we propose a deep learning model that generates SMILES strings using a variational autoencoder with a self-attention mechanism. Compared to the latest drug design model, our proposed model decreased the training time by 1/26 and generated valid SMILES more effectively.
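The abstract names three building blocks: SMILES strings, self-attention, and a variational autoencoder. The sketch below is only an illustration of how those pieces can fit together, not the authors' implementation. Every function name, the 4-dimensional random embeddings, the mean pooling, and the use of unprojected queries, keys, and values are assumptions made here for brevity; the abstract does not specify the architecture.

```python
import math
import random

def tokenize(smiles):
    """Character-level SMILES tokenizer that keeps the halogens Cl and Br whole."""
    tokens, i = [], 0
    while i < len(smiles):
        if smiles[i:i + 2] in ("Cl", "Br"):
            tokens.append(smiles[i:i + 2])
            i += 2
        else:
            tokens.append(smiles[i])
            i += 1
    return tokens

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.
    Assumption: learned projection matrices are omitted to keep the sketch short."""
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_divergence(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

rng = random.Random(0)
tokens = tokenize("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
vocab = {t: i for i, t in enumerate(sorted(set(tokens)))}
dim = 4  # hypothetical embedding size
embed = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in vocab]
x = [embed[vocab[t]] for t in tokens]

h = self_attention(x)  # every token attends to every other token in one step
mu = [sum(row[j] for row in h) / len(h) for j in range(dim)]  # mean pooling (assumption)
log_var = [0.0] * dim  # placeholder; a trained encoder would predict this
z = reparameterize(mu, log_var, rng)
```

Because attention scores for all token pairs are computed without a step-by-step recurrence, the work can be parallelized across the sequence, which is the general reason self-attention models tend to train faster than recurrent ones.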
Keywords
De Novo Drug Design; SMILES; Deep Learning; Self-attention; Variational Autoencoder