SPIN: SE(3)-Invariant Physics Informed Network for Binding Affinity Prediction

Seungyeon Choi, Sangmin Seo, Sanghyun Park

TL;DR

SPIN addresses protein–ligand binding affinity (BA) prediction by integrating an SE(3)-invariant graph transformer with physics-informed priors that enforce rotational/translational invariance and minimal binding energy. It constructs a protein–ligand graph, processes it with a geometric transformer, and uses a pairwise interaction matrix to compute a van der Waals energy term, optimized with two losses: data fit and physics consistency. On CASF-2016 and CSAR-HiQ, SPIN outperforms the compared models and generalizes well, with ablations confirming that both inductive biases contribute. The approach is practical for virtual screening and offers interpretable insights by linking predicted energies to biologically relevant residue interactions.
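To make the "data fit plus physics consistency" idea concrete, here is a minimal sketch in plain Python. It assumes a Lennard-Jones 12-6 potential as a stand-in for the van der Waals term and a simple weighted sum of the two losses; the function names, `epsilon`, `sigma`, and `weight` are all illustrative assumptions and are not taken from the paper.

```python
# Hypothetical sketch: Lennard-Jones 12-6 stands in for the van der Waals term.
# epsilon, sigma, and weight are illustrative values, not the paper's.

def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones 12-6 pair energy at distance r."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def lj_derivative(r, epsilon=1.0, sigma=1.0):
    """Analytic dE/dr of the 12-6 potential."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * s6 * s6 + 6.0 * s6) / r


def dual_loss(predicted, measured, distances, weight=0.1):
    """Squared-error data term plus a mean squared dE/dr physics term."""
    data_term = (predicted - measured) ** 2
    physics_term = sum(lj_derivative(r) ** 2 for r in distances) / len(distances)
    return data_term + weight * physics_term


# The 12-6 potential is minimal at r = 2^(1/6) * sigma, where dE/dr = 0.
r_min = 2.0 ** (1.0 / 6.0)
at_minimum = dual_loss(6.5, 6.5, [r_min, r_min])
displaced = dual_loss(6.5, 6.5, [0.95 * r_min, 1.2 * r_min])
```

With a perfect affinity prediction, the loss is numerically zero when every pair sits at the energy minimum and grows once the pairs are displaced from it, which is the behaviour a physics-consistency penalty of this kind is meant to encourage.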

Abstract

Accurate prediction of protein-ligand binding affinity is crucial for rapid and efficient drug development. This importance has drawn increasing attention to research that models the three-dimensional structure of protein-ligand complexes with graph neural networks. However, traditional methods often fail to accurately model the complex's spatial information or rely solely on geometric features, neglecting the principles of protein-ligand binding. This can lead to overfitting, yielding models that perform poorly on independent datasets and are ultimately less useful in real drug development. To address this issue, we propose SPIN, a model designed to achieve superior generalization by incorporating inductive biases applicable to this task, beyond merely training on empirical data. We define two types of inductive biases: a geometric perspective that keeps binding affinity predictions consistent regardless of the complex's rotations and translations, and a physicochemical perspective that requires minimal binding free energy along the reaction coordinate for effective protein-ligand binding. This prior knowledge enables SPIN to outperform comparative models on benchmark sets such as CASF-2016 and CSAR HiQ. Furthermore, we demonstrate the practicality of our model through virtual screening experiments and validate its reliability and potential through experiments assessing its interpretability.
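The geometric inductive bias can be illustrated with a small check: any quantity computed only from protein-ligand interatomic distances is unchanged when the whole complex is rotated and translated. The sketch below uses toy coordinates and a rotation about the z-axis only (one special case of an SE(3) transformation); the coordinates and helper names are hypothetical and not from the paper.

```python
import math


def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def translate(point, shift):
    """Shift a 3D point by a translation vector."""
    return tuple(p + t for p, t in zip(point, shift))


def transform(points, angle=0.7, shift=(4.0, -2.0, 1.0)):
    """Apply the same rigid motion (rotation, then translation) to all points."""
    return [translate(rotate_z(p, angle), shift) for p in points]


def pairwise_distances(protein, ligand):
    """Distance matrix with one row per protein atom, one column per ligand atom."""
    return [[math.dist(p, l) for l in ligand] for p in protein]


# Toy coordinates (hypothetical, not a real complex).
protein = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 1.0)]
ligand = [(3.0, 1.0, 0.5), (2.5, -1.0, 2.0)]

before = pairwise_distances(protein, ligand)
after = pairwise_distances(transform(protein), transform(ligand))
max_diff = max(
    abs(a - b) for row_a, row_b in zip(before, after) for a, b in zip(row_a, row_b)
)
```

Here `max_diff` stays at floating-point noise level, so a model that consumes only such distance-based features gives the same prediction for every pose of the same complex.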

Paper Structure (15 sections, 8 equations, 5 figures, 1 table)

Figures (5)

  • Figure 1: Two inductive biases of the protein–ligand BA prediction model: (A) Geometric inductive bias, i.e., the BA of a complex remains constant under SE(3) transformations such as translations and rotations. (B) Physicochemical inductive bias, i.e., the protein-ligand complex should be positioned at the point where the binding free energy is minimal among its possible reaction coordinates.
  • Figure 2: Overview of SPIN. A. Protein-ligand complexes are prepared from the PDBbind dataset. B. The atoms constituting a complex are defined as nodes and the connections between atoms as edges, representing the entire structure as a single graph. C. Each node feature is updated through an SE(3)-Graph transformer, in conjunction with the features of the connected edges, to model the geometric information of the complex in three-dimensional space. D. The protein-ligand interaction matrix is extracted through matrix multiplication of the final node representation vectors of protein and ligand atoms. The computed interaction matrix is defined in terms of pairwise energy values between protein atoms and ligand atoms. $\textbf{E}_\textbf{1}$. The binding affinity is predicted by summing the values of the extracted pairwise interaction matrix. $\textbf{E}_\textbf{2}$. The binding free energy is minimized by enforcing that the derivatives of the energies in the extracted pairwise interaction matrix with respect to the pairwise distances between protein and ligand atoms equal zero.
  • Figure 3: Ablation studies on the CASF-2016 and CSAR HiQ sets. Performance on four evaluation metrics is presented for four variants of the SPIN model: the complete model with all inductive biases injected (SPIN), the variant without the geometric inductive bias (SPIN[w/o G]), the variant without the physicochemical inductive bias (SPIN[w/o P]), and the variant with both biases removed (SPIN[w/o GP]).
  • Figure 4: Average Spearman correlation coefficient obtained on 57 target proteins by each scoring function, including the proposed SPIN, in the ranking power test.
  • Figure 5: (A) Visualization of the 3bu1 PDB sample: protein amino acids corresponding to the lowest 10% of the predicted protein-ligand interaction energy are highlighted in red, with the specific energy values annotated alongside them. (B) Results from the Discovery Studio interaction profiler for the 3bu1 PDB sample, showing that a total of six amino acids are involved in critical interactions.
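Steps D and E1 of Figure 2, together with the lowest-10% selection used in Figure 5, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the embeddings are toy two-dimensional vectors, the selection works per protein row rather than per amino acid (aggregation of atoms into residues is omitted), and all function names are hypothetical rather than the authors' code.

```python
def interaction_matrix(protein_emb, ligand_emb):
    """Matrix product P @ L^T: one entry per (protein atom, ligand atom) pair."""
    return [
        [sum(a * b for a, b in zip(p, l)) for l in ligand_emb]
        for p in protein_emb
    ]


def summed_interaction(matrix):
    """Sum of all pairwise entries, used here as the scalar prediction."""
    return sum(sum(row) for row in matrix)


def lowest_fraction(energies, fraction=0.1):
    """Indices of the entries falling in the lowest `fraction` (at least one)."""
    k = max(1, int(len(energies) * fraction))
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    return order[:k]


# Toy final node representations (hypothetical values).
protein_emb = [[1.0, 0.0], [0.0, -2.0], [0.5, 0.5]]
ligand_emb = [[1.0, 1.0], [2.0, -1.0]]

matrix = interaction_matrix(protein_emb, ligand_emb)
prediction = summed_interaction(matrix)

# Per-protein-atom contribution: sum over all ligand atoms in that row.
row_energy = [sum(row) for row in matrix]
highlighted = lowest_fraction(row_energy, fraction=0.1)
```

For these toy values the matrix is `[[1.0, 2.0], [-2.0, 2.0], [1.0, 0.5]]`, the summed prediction is `4.5`, and the single highlighted row is index `1`, the protein atom with the lowest summed interaction energy.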