BInD: Bond and Interaction-generating Diffusion Model for Multi-objective Structure-based Drug Design

Joongwon Lee, Wonho Zhung, Jisu Seo, Woo Youn Kim

TL;DR

The paper tackles multi-objective structure-based drug design by balancing local geometry, molecular properties, and protein-ligand interactions in a diffusion framework. It introduces BInD, a diffusion model that co-generates bonds, atoms, and NCIs in a bipartite graph conditioned on a protein pocket, with $L_t$ and $I_t$ evolving through the reverse process $p_\theta$. Key contributions include a comprehensive benchmark showing balanced performance against baselines, a train-free NCI-driven design workflow, and a case study demonstrating target-selective design via NCI pattern retrieval and optimization (BInDopt). The approach reduces reliance on docking while delivering realistic 3D conformers and favorable NCIs, offering a practical path toward more reliable, selective drug discovery.

Abstract

A remarkable advance in geometric deep generative models with accumulated structural data enables structure-based drug design (SBDD) with target protein information only. However, most existing models struggle to address multi-objectives simultaneously while performing well only in their specialized tasks. Here, we present BInD, a diffusion model with knowledge-based guidance for multi-objective SBDD. BInD is designed to co-generate molecules and their interactions with a target protein to consider all key objectives equally well, including target-specific interactions, molecular properties, and local geometry. Comprehensive evaluations show that BInD achieves robust performance for all objectives while outperforming or matching state-of-the-art methods for each. Finally, we propose a train-free optimization method empowered by retrieving target-specific interactions, highlighting the role of non-covalent interactions in achieving higher selectivity and binding affinities to a target protein.

Paper Structure (7 sections, 19 equations, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Overview of BInD. a, Conceptual illustration of the three key objectives of deep SBDD: accurate local geometry, desirable molecular properties, and target-specific interactions. b, Overall performance of deep SBDD models. While the baseline models fall short in at least one of the three key objectives, BInD shows balanced performance with notable strength in considering interactions. c, An overview of the generative process of BInD, where atoms, bonds, and interactions are denoised explicitly and simultaneously. The 2D illustration below focuses on the types of each entity, where dashed lines indicate NCIs and 'U' indicates an absorbing type. d, A detailed illustration of a single generation step when the desired interaction pattern is given. The molecule being generated at step $t$ is first denoised with a dynamic interaction network ($p_\theta$), then integrated with the given NCIs noised by forward diffusion ($q$). Knowledge-based guidance terms are then applied to fine-tune the atom positions, yielding the molecule at step $t-1$.
  • Figure 2: Comprehensive assessment of BInD with baseline models. a, Box plot of the energy differences before and after Vina minimization. b, Box plot of the energy differences before and after Vina re-docking. c, Cumulative distribution function (CDF) of minimization RMSDs. d, Box plot of strain energies. Each box plot shows the median and quartile values, with black diamonds indicating average values. Models are classified into two groups, depending on whether or not they use information from reference ligands. The reference-dependent models are marked by a shaded region in the box plots and by dashed lines in the CDF. e, Three examples of BInD-generated molecules for the test pocket (PDB ID: 3KC1), with their NCIs, Vina scores, QED, and SA. Generated molecules exhibit higher Vina scores with preferable QED and SA scores compared to the reference. In terms of NCIs, BInD generates molecules with hydrogen bonds targeting hydrophilic regions at the loop between two helices. The molecules also form new NCIs that are not present in the reference molecule, such as a salt bridge with an arginine.
  • Figure 3: Discerning the role of NCIs in SBDD. a, A histogram displaying the relative counts of NCIs averaged over the test data points. Each relative count is normalized so that the number of NCIs in the corresponding reference molecule equals 1.0. b, A box plot of the number of steric clashes between the generated ligand and the protein. c, A box plot of the NCI similarity between the generated and Vina-minimized conformations. Both box plots show the median and quartile values, with black diamonds indicating the average values. The reference-dependent models are marked by the shaded region. d, Vina scores, minimization energies, and docking energies of reference molecules and of molecules generated from the variants of BInD. As the ratio for NCI pattern retrieval from the initial generation decreases, the generated molecules consistently exhibit stronger binding while preserving the gap between the three energy components.
  • Figure 4: BInD as a promising framework for designing mutant-selective EGFR inhibitors. a, A box plot of the differences between Vina docking scores on the mutant and WT EGFR pockets, where positive values correspond to a lower score on the mutant. Median values are depicted as notches, along with the quartile values. b, Density plot showing the distributions of Vina docking scores on the mutant and WT EGFR pockets. A diagonal line divides the plot in half; the upper area is where the docking score on the mutant pocket is lower. A black star indicates the reference ligand of the original crystal structure. c, The $t$-SNE plot in the middle visualizes the NCI patterns generated from BInD, BInDopt1, and BInDopt2, with the color indicating the selectivity measure. The two $t$-SNE plots at the bottom correspond to BInD (right, gray) and BInDopt2 (left, blue), respectively. Examples of generated molecules are visualized with their Vina-minimized and docking poses and their Vina docking scores. NCIs with the mutated residues -- M790 and R858 -- are shown as dashed lines.
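
Two of the derived quantities in the captions above can be written down in a few lines. The sketch below is illustrative only and is not taken from the paper's code: the function names, the dictionary layout of NCI counts, and all numbers are assumptions. It computes the docking-score difference of Figure 4a (positive when the mutant score is lower) and the reference-normalized NCI counts of Figure 3a for a single data point.

```python
def selectivity_gap(score_wt: float, score_mutant: float) -> float:
    """WT score minus mutant score; positive when the mutant score is lower.

    Hypothetical helper: the sign convention follows the Figure 4a caption.
    """
    return score_wt - score_mutant


def relative_nci_counts(generated: dict[str, int],
                        reference: dict[str, int]) -> dict[str, float]:
    """Divide generated NCI counts by the reference counts, per NCI type.

    NCI types absent from the reference (count 0) are skipped here, since the
    ratio is undefined; how the paper handles that case is not stated.
    """
    return {
        nci_type: generated.get(nci_type, 0) / ref_count
        for nci_type, ref_count in reference.items()
        if ref_count > 0
    }


# Made-up values, for illustration only.
gap = selectivity_gap(score_wt=-8.0, score_mutant=-9.5)
ratios = relative_nci_counts(
    generated={"hbond": 3, "hydrophobic": 4},
    reference={"hbond": 2, "hydrophobic": 4, "salt_bridge": 0},
)
print(gap)
print(ratios)
```

Averaging the per-molecule ratios over a test set would give a histogram of the kind described in Figure 3a, where a value of 1.0 means the generated molecules form as many NCIs of that type as the reference.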