- Article
An adaptive graph learning method for automated molecular interactions and properties predictions
- Yuquan Li (ORCID: 0000-0003-2756-0449),
- Chang-Yu Hsieh,
- Ruiqiang Lu,
- Xiaoqing Gong,
- Xiaorui Wang,
- Pengyong Li (ORCID: 0000-0001-5971-046X),
- Shuo Liu,
- Yanan Tian,
- Dejun Jiang (ORCID: 0000-0002-2035-5074),
- Jiaxian Yan,
- Qifeng Bai (ORCID: 0000-0001-7296-6187),
- Huanxiang Liu,
- Shengyu Zhang &
- Xiaojun Yao (ORCID: 0000-0002-8974-0173)
Nature Machine Intelligence volume 4, pages 645–651 (2022)
A preprint version of the article is available at Research Square.
Abstract
Improving efficiency is a core and long-standing challenge in drug discovery, and many graph learning methods have been developed to screen potential drug candidates quickly and at low cost. However, the pursuit of high prediction performance on a limited number of datasets has crystallized their architectures and hyperparameters, leaving them poorly suited to the new data continually generated in drug discovery. Here we propose a flexible method that can adapt to any dataset and make accurate predictions. The proposed method employs an adaptive pipeline that learns from a dataset and outputs a predictor. Without any manual intervention, it achieves far better prediction performance on all tested datasets than traditional methods, which rely on hand-designed neural architectures and other fixed components. We also found that the proposed method is more robust than traditional methods and provides meaningful interpretability. Taken together, the proposed method can serve as a reliable tool for predicting molecular interactions and properties with high adaptability, performance, robustness and interpretability. This work is a solid step towards helping researchers design better drugs more efficiently.
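To illustrate what an 'adaptive pipeline' of this kind can look like in code, the sketch below samples candidate configurations from a search space, scores each one on the dataset at hand and keeps the best predictors. It is a hypothetical illustration under assumed names (SEARCH_SPACE, train_and_score), not GLAM's actual implementation.

```python
# Minimal sketch of a "learn from a dataset, output a predictor" loop.
# Illustration only: the search space and the train_and_score callable
# are hypothetical stand-ins, not GLAM's pipeline.
import random

SEARCH_SPACE = {
    "conv": ["GCN", "GAT", "MPN"],
    "hidden_dim": [64, 128, 256],
    "learning_rate": [1e-3, 1e-4],
}

def sample_config():
    """Pick one option per hyperparameter from the search space."""
    return {key: random.choice(values) for key, values in SEARCH_SPACE.items()}

def adaptive_pipeline(train_and_score, n_trials=20, top_k=3):
    """Sample configurations, score each on the dataset, keep the best few."""
    scored = []
    for _ in range(n_trials):
        config = sample_config()
        score, model = train_and_score(config)   # caller trains on its own data
        scored.append((score, config, model))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]                         # e.g. ensemble these predictors

if __name__ == "__main__":
    # Toy stand-in for training: the score is random, the "model" is the config.
    toy = lambda config: (random.random(), config)
    for score, config, _ in adaptive_pipeline(toy, n_trials=5, top_k=2):
        print(f"{score:.3f}  {config}")
```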
Code availability
All code of GLAM is freely available at https://github.com/yvquanli/GLAM under an MIT licence. The version used for this publication is available at https://doi.org/10.5281/zenodo.6371164 (ref. 43).
References
1. Schneider, G. Automating drug discovery. Nat. Rev. Drug Discov. 17, 97–113 (2018).
2. Schneider, P. et al. Rethinking drug design in the artificial intelligence era. Nat. Rev. Drug Discov. 19, 353–364 (2020).
3. Inglese, J. & Auld, D. S. in Wiley Encyclopedia of Chemical Biology (ed. Begley, T. P.) (Wiley, 2008); https://doi.org/10.1002/9780470048672.wecb223
4. Sliwoski, G., Kothiwale, S., Meiler, J. & Lowe, E. W. Computational methods in drug discovery. Pharmacol. Rev. 66, 334–395 (2014).
5. Fleming, N. How artificial intelligence is changing drug discovery. Nature 557, S55–S57 (2018).
6. Zheng, S., Li, Y., Chen, S., Xu, J. & Yang, Y. Predicting drug–protein interaction using quasi-visual question answering system. Nat. Mach. Intell. 2, 134–140 (2020).
7. Shen, W. X. et al. Out-of-the-box deep learning prediction of pharmaceutical properties by broadly learned knowledge-based molecular representations. Nat. Mach. Intell. 3, 334–343 (2021).
8. Kotsias, P.-C. et al. Direct steering of de novo molecular generation with descriptor conditional recurrent neural networks. Nat. Mach. Intell. 2, 254–265 (2020).
9. Méndez-Lucio, O., Baillif, B., Clevert, D. A., Rouquié, D. & Wichard, J. De novo generation of hit-like molecules from gene expression signatures using artificial intelligence. Nat. Commun. 11, 10 (2020).
10. Chen, H., Engkvist, O., Wang, Y., Olivecrona, M. & Blaschke, T. The rise of deep learning in drug discovery. Drug Discov. Today 23, 1241–1250 (2018).
11. Jiang, S. & Balaprakash, P. Graph neural network architecture search for molecular property prediction. In Proc. IEEE International Conference on Big Data 1346–1353 (IEEE, 2020).
12. Cai, S. et al. Rethinking graph neural architecture search from message-passing. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 6653–6662 (IEEE, 2021); https://doi.org/10.1109/CVPR46437.2021.00659
13. Zhang, Z., Wang, X. & Zhu, W. Automated machine learning on graphs: a survey. In Proc. International Joint Conference on Artificial Intelligence 4704–4712 (IJCAI, 2021); https://doi.org/10.24963/ijcai.2021/637
14. Ekins, S. et al. Exploiting machine learning for end-to-end drug discovery and development. Nat. Mater. 18, 435–441 (2019).
15. Sculley, D. et al. Hidden technical debt in machine learning systems. In Proc. Advances in Neural Information Processing Systems 2503–2511 (NIPS, 2015).
16. Jiang, M. et al. Drug–target affinity prediction using graph neural network and contact maps. RSC Adv. 10, 20701–20712 (2020).
17. Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In Proc. 2017 International Conference on Learning Representations (ICLR, 2017).
18. Veličković, P. et al. Graph attention networks. In Proc. 2018 International Conference on Learning Representations 1–12 (ICLR, 2018).
19. Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In Proc. International Conference on Machine Learning Vol. 3, 2053–2070 (ACM, 2017).
20. Xiong, Z. et al. Pushing the boundaries of molecular representation for drug discovery with graph attention mechanism. J. Med. Chem. https://doi.org/10.1021/acs.jmedchem.9b00959 (2019).
21. Xu, K., Jegelka, S., Hu, W. & Leskovec, J. How powerful are graph neural networks? In Proc. 7th International Conference on Learning Representations (ICLR, 2019).
22. Källberg, M. et al. Template-based protein structure modeling using the RaptorX web server. Nat. Protoc. 7, 1511–1522 (2012).
23. Li, H., Leung, K. S., Wong, M. H. & Ballester, P. J. Improving AutoDock Vina using Random Forest: the growing accuracy of binding affinity prediction by the effective exploitation of larger data sets. Mol. Informatics 34, 115–126 (2015).
24. Chen, L. et al. TransformerCPI: improving compound–protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments. Bioinformatics 36, 4406–4414 (2020).
25. Huang, K., Xiao, C., Hoang, T., Glass, L. & Sun, J. CASTER: predicting drug interactions with chemical substructure representation. Proc. AAAI Conf. Artif. Intell. 34, 702–709 (2020).
26. Yang, Y.-Y., Rashtchian, C., Zhang, H., Salakhutdinov, R. & Chaudhuri, K. A closer look at accuracy vs. robustness. In Proc. 34th International Conference on Neural Information Processing Systems Vol. 720, 8588–8601 (NIPS, 2020).
27. Tetko, I. V., Tanchuk, V. Y. & Villa, A. E. P. Prediction of n-octanol/water partition coefficients from PHYSPROP database using artificial neural networks and E-state indices. J. Chem. Inf. Comput. Sci. 41, 1407–1421 (2001).
28. Zeng, Y., Chen, X., Luo, Y., Li, X. & Peng, D. Deep drug–target binding affinity prediction with multiple attention blocks. Briefings Bioinform. 22, bbab117 (2021).
29. Withnall, M., Lindelöf, E., Engkvist, O. & Chen, H. Building attention and edge message passing neural networks for bioactivity and physical-chemical property prediction. J. Cheminform. 12, 1–18 (2020).
30. Feurer, M., Eggensperger, K., Falkner, S., Lindauer, M. & Hutter, F. Auto-Sklearn 2.0: the next generation (2020); https://www.researchgate.net/publication/342801746_Auto-Sklearn_20_The_Next_Generation
31. Erickson, N. et al. AutoGluon-Tabular: robust and accurate AutoML for structured data. In ICML Workshop on Automated Machine Learning (2020).
32. Rogers, D. & Hahn, M. Extended-connectivity fingerprints. J. Chem. Inf. Model. 50, 742–754 (2010).
33. Xiong, J., Xiong, Z., Chen, K., Jiang, H. & Zheng, M. Graph neural networks for automated de novo drug design. Drug Discov. Today 26, 1382–1393 (2021).
34. Dai, H. et al. Retrosynthesis prediction with conditional graph logic network. In Proc. 33rd International Conference on Neural Information Processing Systems Vol. 796, 8872–8882 (NIPS, 2020).
35. Wang, X. et al. RetroPrime: a diverse, plausible and transformer-based method for single-step retrosynthesis predictions. Chem. Eng. J. 420, 129845 (2021).
36. Kuznetsov, M. & Polykovskiy, D. MolGrow: a graph normalizing flow for hierarchical molecular generation. In Proc. AAAI Conference on Artificial Intelligence Vol. 35, 8226–8234 (AAAI, 2021).
37. Luo, Y., Yan, K. & Ji, S. GraphDF: a discrete flow model for molecular graph generation. In Proc. 38th International Conference on Machine Learning Vol. 139, 7192–7203 (PMLR, 2021).
38. Liu, M., Yan, K., Oztekin, B. & Ji, S. GraphEBM: molecular graph generation with energy-based models. In Proc. ICLR Workshop on Energy Based Models 1–16 (2021).
39. Tran-Nguyen, V. K., Jacquemard, C. & Rognan, D. LIT-PCBA: an unbiased data set for machine learning and virtual screening. J. Chem. Inf. Model. 60, 4263–4273 (2020).
40. Gilson, M. K. et al. BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res. 44, D1045–D1053 (2016).
41. Wishart, D. S. et al. DrugBank: a knowledge base for drugs, drug actions and drug targets. Nucleic Acids Res. 36, D901–D906 (2008).
42. Wu, Z. et al. MoleculeNet: a benchmark for molecular machine learning. Chem. Sci. 9, 513–530 (2018).
43. Li, Y. Code for ‘An adaptive graph learning method for automated molecular interactions and properties predictions’ (Zenodo, 2022); https://doi.org/10.5281/zenodo.6371164
44. Halgren, T. A. et al. Glide: a new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening. J. Med. Chem. 47, 1750–1759 (2004).
45. Fey, M. & Lenssen, J. E. Fast graph representation learning with PyTorch Geometric. In Proc. ICLR 2019 Workshop on Representation Learning on Graphs and Manifolds (ICLR, 2019); https://arxiv.org/abs/1903.02428
Acknowledgements
This work was supported by the National Natural Science Foundation of China (22173038 and 21775060). We thank the Supercomputing Center of Lanzhou University for providing high-performance computing resources. We acknowledge help from J. Xu, the author of RaptorX (ref. 22), as well as help from M. Jiang, the author of DGraphDTA (ref. 16).
Author information
These authors contributed equally: Yuquan Li, Chang-Yu Hsieh.
Authors and Affiliations
College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou, China
Yuquan Li, Ruiqiang Lu, Xiaoqing Gong & Xiaojun Yao
Tencent Quantum Laboratory, Tencent, Shenzhen, China
Yuquan Li, Chang-Yu Hsieh & Shengyu Zhang
State Key Laboratory of Quality Research in Chinese Medicines, Macau University of Science and Technology, Macau, China
Xiaorui Wang & Xiaojun Yao
School of Computer Science and Technology, Xidian University, Xi'an, China
Pengyong Li
School of Pharmacy, Lanzhou University, Lanzhou, China
Shuo Liu, Yanan Tian & Huanxiang Liu
College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Dejun Jiang
School of Data Science, University of Science and Technology of China, Hefei, China
Jiaxian Yan
School of Basic Medical Sciences, Lanzhou University, Lanzhou, China
Qifeng Bai
Contributions
Y.L., C.-Y.H. and X.Y. conceived the project. Y.L., C.-Y.H., R.L., X.G., X.W. and P.L. designed and conducted the experiments. C.-Y.H., S.L., Y.T., D.J., J.Y., Q.B. and H.L. evaluated the experiments and contributed ideas. S.Z., C.-Y.H. and X.Y. managed and supervised the project. All authors co-wrote the manuscript.
Corresponding author
Correspondence to Xiaojun Yao.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks William McCorkindale and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Design space for blocks of the architectures.
a, Feed-forward block. It takes a tensor as input and outputs a tensor. Abbreviations: Norm (normalization), ReLU (rectified linear units), CeLU (continuously differentiable exponential linear units). b, Message-passing block. It takes a graph as input and outputs a graph. Abbreviations: GCN (graph convolutional networks), GAT (graph attention networks), MPN (message-passing neural networks), Tri-MPN (triplet message-passing neural networks), Light Tri-MPN (light triplet message-passing neural networks). c, Fusion block. It takes a graph as input and outputs a tensor. Dot denotes the dot-product operation. d, Global pooling block. It takes a graph as input and outputs a tensor.
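To make the block design space concrete, here is a minimal sketch assuming PyTorch Geometric (ref. 45): it builds one candidate message-passing block (GCN or GAT; the triplet variants are GLAM-specific and omitted here) plus one global-pooling choice, and applies them to a toy graph. The class and function names are illustrative, not GLAM's API.

```python
# Minimal sketch (not the authors' implementation) of sampling one
# message-passing block and one global-pooling block from a design space.
import random
import torch
from torch import nn
from torch_geometric.nn import GCNConv, GATConv, global_add_pool, global_mean_pool

class MessagePassingBlock(nn.Module):
    """A graph-in/graph-out block whose convolution type is a searchable choice."""
    def __init__(self, conv_name: str, in_dim: int, out_dim: int):
        super().__init__()
        if conv_name == "GCN":
            self.conv = GCNConv(in_dim, out_dim)
        elif conv_name == "GAT":
            self.conv = GATConv(in_dim, out_dim)
        else:
            raise ValueError(f"Unknown convolution: {conv_name}")
        self.act = nn.CELU()  # one of the activation options in the design space

    def forward(self, x, edge_index):
        return self.act(self.conv(x, edge_index))

def sample_architecture(in_dim=32, hidden_dim=64):
    """Randomly pick one option per block, mimicking a design-space sample."""
    conv_name = random.choice(["GCN", "GAT"])                   # message-passing block
    pool = random.choice([global_add_pool, global_mean_pool])   # global pooling block
    return MessagePassingBlock(conv_name, in_dim, hidden_dim), pool

if __name__ == "__main__":
    # Toy graph: 4 nodes with 32-dimensional features, all in a single graph.
    x = torch.randn(4, 32)
    edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
    batch = torch.zeros(4, dtype=torch.long)
    block, pool = sample_architecture()
    h = block(x, edge_index)
    print(pool(h, batch).shape)  # graph-level embedding, shape [1, 64]
```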
Extended Data Fig. 2 Cases of node-level interpretation.
a, Case studies of solubility prediction. Atoms in hydrophilic groups appear bluer in our visualization, meaning their weights are closer to 1; atoms in lipophilic groups appear redder, meaning their weights are closer to −1. b, Case studies of drug–drug interactions. The visualization shows that the models in the predictor pay more attention to the nitrates of isosorbide dinitrate and nicorandil, and to the N-methyl group of sildenafil and udenafil.
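As a minimal, assumed sketch of the colouring scheme described above (not GLAM's code), the function below maps a per-atom weight in [−1, 1] to an RGB colour that shades from red through white to blue; such RGB tuples could then be passed to a molecule-drawing toolkit's atom-highlighting options.

```python
# Assumed colour mapping for node-level interpretation: weights near +1
# (hydrophilic) are drawn bluer, weights near -1 (lipophilic) redder,
# passing through white at 0.

def atom_color(weight: float) -> tuple:
    """Map a weight in [-1, 1] to RGB: -1 -> red, 0 -> white, +1 -> blue."""
    w = max(-1.0, min(1.0, float(weight)))  # clamp to [-1, 1]
    red = 1.0 - max(w, 0.0)
    green = 1.0 - abs(w)
    blue = 1.0 + min(w, 0.0)
    return (red, green, blue)

if __name__ == "__main__":
    # Hypothetical per-atom weights for a small molecule.
    for w in (0.9, 0.0, -0.8):
        print(f"weight {w:+.1f} -> RGB {atom_color(w)}")
```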
Supplementary information
Supplementary Information
Supplementary Tables 1–5.
About this article
Cite this article
Li, Y., Hsieh, C.-Y., Lu, R. et al. An adaptive graph learning method for automated molecular interactions and properties predictions. Nat. Mach. Intell. 4, 645–651 (2022). https://doi.org/10.1038/s42256-022-00501-8