General Binding Affinity Guidance for Diffusion Models in Structure-Based Drug Design

Yue Jian, Curtis Wu, Danny Reidenbach, Aditi S. Krishnapriyan

TL;DR

BADGER addresses the challenge of guiding binding affinity in diffusion-based structure-based drug design (SBDD). It introduces two complementary strategies, Classifier Guidance (CG) and Classifier-Free Guidance (CFG), to steer samples toward target affinity values, with a Gaussian-based continuous-conditioning energy and a multi-constraint extension for QED and SA. Evaluated on CrossDocked2020 and PDBBind, BADGER yields up to 60% improvements in ligand-protein binding energy over prior diffusion methods while preserving chemical validity, and it extends naturally to multi-property diffusion guidance. The framework is modular and plug-and-play, enabling affinity-aware diffusion across backbones and pockets, with public code for reproducibility and the potential to accelerate drug discovery.
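As a rough illustration of the Gaussian-based conditioning idea mentioned above, the sketch below treats the target affinity as the center of a Gaussian and uses the gradient of the resulting energy to nudge ligand coordinates. The linear `toy_affinity` predictor, the `sigma` width, and the `scale` step size are assumptions made for illustration only. They are not taken from the BADGER code, where a trained network evaluated on noisy samples would presumably play the predictor's role and the gradient would come from automatic differentiation.

```python
import numpy as np


def toy_affinity(x, w, b):
    """Hypothetical stand-in for a trained affinity predictor (a linear map)."""
    return float(np.dot(w, x.ravel()) + b)


def gaussian_energy(pred, target, sigma):
    """Negative log of an unnormalized Gaussian centered on the target affinity."""
    return 0.5 * ((pred - target) / sigma) ** 2


def guidance_gradient(x, w, b, target, sigma):
    """Analytic gradient of the Gaussian energy w.r.t. x for the linear toy predictor."""
    residual = toy_affinity(x, w, b) - target
    return (residual / sigma**2) * w.reshape(x.shape)


def guided_step(x, w, b, target, sigma, scale):
    """Move x a small step down the energy gradient (assumed update rule)."""
    return x - scale * guidance_gradient(x, w, b, target, sigma)


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))   # 5 atoms, 3D coordinates (toy data)
w = rng.normal(size=15)       # weights of the toy predictor
b, target, sigma = 0.0, -10.0, 1.0

x_guided = guided_step(x, w, b, target, sigma, scale=0.01)
energy_before = gaussian_energy(toy_affinity(x, w, b), target, sigma)
energy_after = gaussian_energy(toy_affinity(x_guided, w, b), target, sigma)
```

With a small enough step, `energy_after` is lower than `energy_before`, meaning the predicted affinity has moved toward the target. In an actual sampler, a term of this kind would be added at each denoising step rather than applied once.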

Abstract

Structure-based drug design (SBDD) aims to generate ligands that bind strongly and specifically to target protein pockets. Recent diffusion models have advanced SBDD by capturing the distributions of atomic positions and types, yet they often underemphasize binding affinity control during generation. To address this limitation, we introduce BADGER, a general binding-affinity guidance framework for diffusion models in SBDD. BADGER incorporates binding affinity awareness through two complementary strategies: (1) classifier guidance, which applies gradient-based affinity signals during sampling in a plug-and-play fashion, and (2) classifier-free guidance, which integrates affinity conditioning directly into diffusion model training. Together, these approaches enable controllable ligand generation guided by binding affinity. BADGER can be added to any diffusion model and achieves up to a 60% improvement in ligand-protein binding affinity of sampled molecules over prior methods. Furthermore, we extend the framework to multi-constraint diffusion guidance, jointly optimizing for binding affinity, drug-likeness (QED), and synthetic accessibility (SA) to design realistic and synthesizable drug candidates.
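To make the classifier-free strategy more concrete, here is a minimal sketch of how conditional and unconditional noise predictions are commonly combined at sampling time. The interpolation form and the `guidance_scale` parameter follow the widely used classifier-free guidance recipe and are an assumption here. The exact weighting used in BADGER may differ, and the arrays below are toy placeholders for the outputs of a denoising network.

```python
import numpy as np


def cfg_mix(eps_cond, eps_uncond, guidance_scale):
    """Combine noise predictions: scale 0 gives the unconditional prediction,
    scale 1 gives the conditional one, and larger values extrapolate past it."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)


# Toy noise predictions for 4 atoms in 3D (placeholders, not model outputs).
eps_uncond = np.zeros((4, 3))
eps_cond = np.ones((4, 3))

eps_unguided = cfg_mix(eps_cond, eps_uncond, guidance_scale=0.0)
eps_conditional = cfg_mix(eps_cond, eps_uncond, guidance_scale=1.0)
eps_amplified = cfg_mix(eps_cond, eps_uncond, guidance_scale=2.0)
```

Training a single network for both predictions is usually done by randomly dropping the condition (here, the affinity value) during training, which is consistent with the abstract's statement that affinity conditioning is integrated into diffusion model training.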

Paper Structure

This paper contains 61 sections, 26 equations, 10 figures, 17 tables, 4 algorithms.

Figures (10)

  • Figure 1: Overview of BADGER, a general guidance framework for diffusion-based molecular generation. Top: unguided (left) vs. guided (right) sampling trajectories from Gaussian noise to molecular structures. BADGER (right) employs either classifier guidance (gradient-based refinement using a trained classifier) or classifier-free guidance (mixture of conditional and unconditional noise predictions). Bottom: evolution of binding energy distributions $P_t(\Delta G)$, showing that guided sampling (right) under BADGER shifts samples toward lower binding energies.
  • Figure 2: Improvement in median Vina Scores across 100 protein pockets after applying Classifier Guidance in BADGER. Each panel corresponds to a diffusion model variant: TargetDiff (top), DecompDiff Ref (middle), and DecompDiff Beta (bottom). For each pocket, we compare the median Vina Score before (blue) and after (orange) applying classifier-guided sampling. Across all models, BADGER consistently improves binding quality, achieving lower Vina Scores for $99\%$ of the pockets (lower is better $\downarrow$). For a few outlier pockets, the unguided model’s scores exceed the plotted range, yet classifier guidance still yields notable improvements.
  • Figure 3: Distribution of Vina Scores for molecules generated with and without BADGER guidance across diffusion model baselines. Shown are results for TargetDiff (left), DecompDiff Ref (middle), and DecompDiff Beta (right). Each plot compares unguided sampling (pink) with guided variants, including Classifier Guidance (CG), Classifier-Free Guidance (CFG), and Multi-Constraint Classifier Guidance (MC-CG). Across all models, BADGER consistently lowers both the mean and median Vina Scores and shifts the entire distribution toward lower (better) binding energies, potentially suggesting more favorable protein–ligand interactions.
  • Figure 4: Steric Clashes Score improvement with BADGER across diffusion model variants. Each box plot reports the distribution (log scale) of Steric Clashes Scores for generated ligand poses reconstructed from sampled molecules. Lower values indicate fewer atomic overlaps and more physically stable conformations. Across all diffusion models (TargetDiff, DecompDiff Ref, and DecompDiff Beta), BADGER systematically reduces steric clashes under Classifier Guidance (CG), Classifier-Free Guidance (CFG), and Multi-Constraint Classifier Guidance (MC-CG), potentially suggesting improved geometric plausibility of generated poses.
  • Figure 5: Improvement in median Vina Score on each of the 100 pockets in the test set for each diffusion model (TargetDiff, DecompDiff Ref, and DecompDiff Beta) after applying the Classifier-Free Guidance (CFG) and Multi-Constraint Classifier Guidance (MC-CG) versions of BADGER. BADGER improves the median Vina Score for most of the protein pockets.
  • ...and 5 more figures