One-step Structure Prediction and Screening for Protein-Ligand Complexes using Multi-Task Geometric Deep Learning

Kelei He; Tiejun Dong; Jinhui Wu; Junfeng Zhang

TL;DR

LigPose directly optimizes the three-dimensional structure of the protein-ligand complex, learning binding strength and atomic interactions as auxiliary tasks, which enables one-step prediction without docking tools.

Abstract

Understanding the structure of the protein-ligand complex is crucial to drug development. Existing virtual structure measurement and screening methods are dominated by docking and its derived methods combined with deep learning. However, their sampling and scoring methodology has largely restricted their accuracy and efficiency. Here, we show that these two fundamental tasks, structure prediction and screening, can be accurately tackled with a single model, namely LigPose, based on multi-task geometric deep learning. By representing the ligand-protein pair as a graph, LigPose directly optimizes the three-dimensional structure of the complex, learning binding strength and atomic interactions as auxiliary tasks, which enables one-step prediction without docking tools. Extensive experiments show that LigPose achieves state-of-the-art performance on major tasks in drug research. Its considerable improvements indicate a promising paradigm for AI-based drug development pipelines.
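
To make the phrase "representing the ligand-protein pair as a graph" concrete, the following is a minimal sketch and not LigPose's actual featurization. It assumes atoms are nodes and that an edge joins any two atoms closer than a distance cutoff; the function name `build_complex_graph`, the 5 Å cutoff, and the toy coordinates are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def build_complex_graph(ligand_xyz, pocket_xyz, cutoff=5.0):
    """Build a toy graph over ligand and pocket atoms.

    Nodes are (source, xyz) tuples; edges are (i, j, distance) for every
    atom pair no farther apart than `cutoff` (in Angstroms).
    The cutoff rule is an assumption, not the paper's definition.
    """
    nodes = [("ligand", xyz) for xyz in ligand_xyz]
    nodes += [("protein", xyz) for xyz in pocket_xyz]

    edges = []
    for i, j in combinations(range(len(nodes)), 2):
        d = math.dist(nodes[i][1], nodes[j][1])
        if d <= cutoff:
            edges.append((i, j, d))
    return nodes, edges


# Toy coordinates (hypothetical): two ligand atoms, two pocket atoms.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
pocket = [(0.0, 3.0, 0.0), (10.0, 10.0, 10.0)]
nodes, edges = build_complex_graph(ligand, pocket)
```

In this toy case the distant pocket atom ends up with no edges, which illustrates why a cutoff keeps the graph sparse; a real model would also attach atom and bond features to the nodes and edges.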

Paper Structure (58 sections, 24 equations, 15 figures, 11 tables)

Introduction
Results
LigPose pipeline
Predicting accurate complex structures
Ligands with various flexibilities
Accurate and fast screening
Validating on the SARS-CoV-2 $\text{M}^{\text{pro}}$
Learning non-covalent interactions
Discussion
Methods
Data collection
Benchmark Dataset
Real-world dataset
Pre-processing
LigPose architecture
...and 43 more sections

Figures (15)

Figure 1: LigPose predicts ligand-protein complex structure using geometric deep learning in an end-to-end manner, compared with the conventional docking method. (a) Notations of a ligand, a protein pocket, and their complex. (b) The pipeline of the conventional docking tool consists of two stages, i.e., sampling and ranking [meng2011molecular]. (c) The pipeline of our method. (d) Architecture of LigPose.
Figure 2: Performance of LigPose on ligand-binding conformation prediction compared with popular molecular docking tools and hybrid deep learning methods. (a) Visualization of the poses generated by LigPose and a docking tool (Smina [smina]) for four ligands with various weights and numbers of rotatable bonds (PDB codes: 3RSX, 1P1Q, 3DXG, 4JXS). The native poses, the LigPose predictions, and the docking-tool predictions are shown as white, green, and cyan backbones, respectively. Within the predictions, oxygen and nitrogen atoms are colored red and blue, respectively. (b) Quantitative comparison of success rate between LigPose and the top-scored poses generated by $12$ docking tools on the refined set of PDBbind. (c) Cumulative distribution of the RMSD of LigPose. The red dashed line indicates the RMSD threshold of 2 Å. Blue, orange, and green denote all ligands, the regular organic ligands, and the peptide/peptide-like ligands, respectively. (d) Quantitative comparison of success rate between LigPose, the docking tools, and the hybrid deep learning methods on the core set of PDBbind.
Figure 3: Performance of LigPose with respect to ligand flexibility. (a) RMSD trajectories of two representative samples (PDB codes: 1O5B and 1EBY). LigPose updates a given ligand $24$ times using $4$ cycles, with each cycle updating the ligand $6$ times. The ligand atoms in the $\{1,8,16,24\}$th updates are visualized at the panel's top and bottom. The predicted atoms are shown in orange and cyan for 1O5B and 1EBY, respectively. Oxygen, nitrogen, and iodine atoms are colored red, blue, and purple, respectively. The native poses are placed in the background in grey. (b-c) RMSD (b) and success rate (c) for ligands with respect to the number of rotatable bonds on the core set of PDBbind. The red dashed line indicates the RMSD threshold of 2 Å. (d) Inference time of LigPose and four popular docking tools on $1000$ randomly selected samples in the PDBbind dataset.
Figure 4: Screening power of LigPose on the CASF-2016 benchmark. (a) Average enhancement factors of forward screening. (b) Success rates of forward screening. (c) Success rates of reverse screening.
Figure 5: Applications of LigPose on drug research for SARS-CoV-2 $\text{M}^{\text{pro}}$. (a) Two visualized samples of $\text{M}^{\text{pro}}$ complexes (PDB codes: 5RGU and 7ANS). Predictions of LigPose are denoted as green and cyan backbones. Within the predictions, the oxygen, nitrogen, and sulfur atoms are denoted as red, blue, and yellow colors, respectively. (b) Success rates of structure prediction for $\text{M}^{\text{pro}}$ complexes. (c) Success rates of virtual screening for $\text{M}^{\text{pro}}$ inhibitors.
...and 10 more figures
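
The captions above report success rates against an RMSD threshold of 2 Å. As a minimal sketch of how such a metric is typically computed, the snippet below assumes the predicted and native ligand atoms are already in matching order and applies no symmetry correction; the function names, the toy coordinates, and the choice of counting RMSD equal to the threshold as a success are illustrative assumptions, not details taken from the paper.

```python
import math


def rmsd(pred, native):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(pred) != len(native) or not pred:
        raise ValueError("pred and native must be non-empty and equal in length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(pred, native))
    return math.sqrt(squared / len(pred))


def success_rate(rmsds, threshold=2.0):
    """Fraction of poses whose RMSD is at or below the threshold (Angstroms)."""
    return sum(r <= threshold for r in rmsds) / len(rmsds)


# Toy example (hypothetical coordinates): every atom is shifted by 1 A in z.
native_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
pose_rmsd = rmsd(predicted_pose, native_pose)

# Hypothetical per-complex RMSD values, for illustration only.
rate = success_rate([0.5, 1.9, 3.0, 2.5])
```

A production evaluation would usually also account for symmetric atoms when matching predicted and native poses, which this sketch deliberately omits.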
