Assessing interaction recovery of predicted protein-ligand poses

David Errington; Constantin Schneider; Cédric Bouysset; Frédéric A. Dreyer

TL;DR

Ignoring protein-ligand interaction fingerprints can lead to overestimation of model performance, most notably for recent protein-ligand cofolding models, which often fail to recapitulate key interactions.

Abstract

The field of protein-ligand pose prediction has seen significant advances in recent years, with machine learning-based methods now commonly used in lieu of classical docking methods or even to predict all-atom protein-ligand complex structures. Most contemporary studies focus on the accuracy and physical plausibility of ligand placement to determine pose quality, often neglecting a direct assessment of the interactions observed with the protein. In this work, we demonstrate that ignoring protein-ligand interaction fingerprints can lead to overestimation of model performance, most notably for recent protein-ligand cofolding models, which often fail to recapitulate key interactions.

Paper Structure (13 sections, 5 figures, 1 table)

Introduction
Method
Protein-ligand interaction fingerprint
Classical docking algorithms
ML docking algorithms
Protein-ligand cofolding
Data and Metrics
Results
PoseBusters benchmark
Interaction recovery rates
Recovery of different interaction types
Discussion
Correlation between PLIF recovery and RMSD

Figures (5)

Figure 1: Left: Two-dimensional representation of the ligand EZO and its four interactions with the crystal structure 6M2B. Basic residues are shown in blue and residues containing a sulfur atom are shown in yellow. Right: Docked poses generated with GOLD, DiffDock-L and RosettaFold-AllAtom showing the calculated interactions for each model, with the ground truth ligand in grey.
Figure 2: The ratio of predicted protein-ligand complex structures for each model passing checks on ligand positioning (RMSD$\leq$2Å), physicality (PoseBuster-valid) and interaction recovery (PLIF-valid).
Figure 3: Recovery of protein-ligand interaction fingerprint for each model. The distributions of PLIF recovery among poses that pass the RMSD and PoseBuster tests are shown in dashed and dotted lines, respectively.
Figure 4: Ratio to the ground truth of calculated and correctly recovered (recall) interactions, shown separately for each interaction type.
Figure 5: PLIF recovery rate and RMSD, highlighting data points which are PoseBuster-valid. Note that we use a modified definition of PB-validity that excludes ligand RMSD. The red line indicates a ligand RMSD of 2Å.
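
To make the metrics named in the captions above concrete, the following is a minimal sketch of how an interaction recovery rate and a combined pass/fail check could be computed. It is not the authors' implementation: the representation of an interaction as a (residue, interaction type) pair, the function names, the residue labels, and the default recovery cutoff of 1.0 are all assumptions made for illustration. Only the 2 Å RMSD cutoff comes from the captions.

```python
from typing import Set, Tuple

# Assumed representation: one interaction = (residue label, interaction type).
Interaction = Tuple[str, str]


def plif_recovery(reference: Set[Interaction], predicted: Set[Interaction]) -> float:
    """Fraction of reference interactions also present in the predicted pose."""
    if not reference:
        # Assumption: with no reference interactions there is nothing to recover.
        return 1.0
    return len(reference & predicted) / len(reference)


def passes_checks(rmsd: float, recovery: float,
                  rmsd_cutoff: float = 2.0,
                  recovery_cutoff: float = 1.0) -> bool:
    """Combine a ligand RMSD check (in Å) with an interaction recovery check.

    The recovery cutoff is a hypothetical default, not a value from the source.
    """
    return rmsd <= rmsd_cutoff and recovery >= recovery_cutoff


# Hypothetical interactions, for illustration only.
reference = {
    ("ASP86", "HBDonor"),
    ("LYS33", "Cationic"),
    ("PHE80", "PiStacking"),
    ("MET120", "Hydrophobic"),
}
predicted = {
    ("ASP86", "HBDonor"),
    ("PHE80", "PiStacking"),
    ("GLU12", "HBAcceptor"),  # extra interaction absent from the reference
}

recovery = plif_recovery(reference, predicted)
print(f"PLIF recovery: {recovery:.2f}")
print("Passes RMSD and recovery checks:", passes_checks(1.2, recovery))
```

In this sketch a pose with a low RMSD can still fail the combined check when it misses reference interactions, which is the kind of discrepancy the figures are meant to expose.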
