
Generating 3D Binding Molecules Using Shape-Conditioned Diffusion Models with Guidance

Ziqi Chen, Bo Peng, Tianhua Zhai, Daniel Adu-Ampratwum, Xia Ning

TL;DR

This work introduces DiffSMol, a diffusion-based method for generating 3D binding molecules conditioned on ligand-shape representations and optionally guided by protein pockets. It combines a pre-trained equivariant shape encoder with a diffusion model that directly places atoms in 3D space, and integrates shape guidance (SG) and pocket guidance (PG) to align generated molecules with target shapes and pockets. Across SMG and PMG benchmarks, DiffSMol variants outperform state-of-the-art baselines in shape similarity, structural realism, and binding-related metrics, while offering improved efficiency in PMG settings. Case studies on CDK6 and NEP yield molecules with promising drug-like properties, ADMET profiles, and docking scores, suggesting potential utility in early-stage drug candidate discovery. The framework supports learning from large-scale molecule data and can be adapted to ligand-based (LBDD) and structure-based (SBDD) drug design through its controllable guidance mechanisms.

Abstract

Drug development is a critical but notoriously resource- and time-consuming process. In this manuscript, we develop a novel generative artificial intelligence (genAI) method, DiffSMol, to facilitate drug development. DiffSMol generates 3D binding molecules based on the shapes of known ligands. DiffSMol encapsulates geometric details of ligand shapes within pre-trained, expressive shape embeddings and then generates new binding molecules through a diffusion model. DiffSMol further modifies the generated 3D structures iteratively via shape guidance to better resemble the ligand shapes. It also tailors the generated molecules toward optimal binding affinities under the guidance of protein pockets. Here, we show that DiffSMol outperforms the state-of-the-art methods on benchmark datasets. When generating binding molecules resembling ligand shapes, DiffSMol with shape guidance achieves a success rate of 61.4%, substantially outperforming the best baseline (11.2%), while producing molecules with novel molecular graph structures. DiffSMol with pocket guidance also outperforms the best baseline in binding affinities by 13.2%, and even by 17.7% when combined with shape guidance. Case studies for two critical drug targets demonstrate very favorable physicochemical and pharmacokinetic properties of the generated molecules and thus the potential of DiffSMol in developing promising drug candidates.
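
To make the notion of "resembling the ligand shapes" concrete, the following is a minimal sketch of one common way to score shape similarity between two 3D molecules: a Tanimoto overlap of voxelized atom volumes. This is an illustration only and is not taken from the manuscript; the function names, the single atom radius of 1.7 Å, the grid bounds, and the grid spacing are all assumptions, and the two molecules are assumed to be already aligned in the same coordinate frame.

```python
import numpy as np


def occupancy_grid(coords, radius=1.7, spacing=0.5, lo=-6.0, hi=6.0):
    """Boolean occupancy over a cubic grid (hypothetical helper).

    A voxel is True when its center lies within `radius` of any atom.
    All default values are illustrative assumptions.
    """
    coords = np.asarray(coords, dtype=float)
    axis = np.arange(lo, hi + spacing / 2, spacing)
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    centers = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)  # (V, 3)
    # Distance from every voxel center to every atom: shape (V, N).
    dists = np.linalg.norm(centers[:, None, :] - coords[None, :, :], axis=-1)
    return (dists <= radius).any(axis=1)


def shape_tanimoto(coords_a, coords_b, **grid_kwargs):
    """Volumetric Tanimoto: |A and B| / |A or B| over occupied voxels."""
    a = occupancy_grid(coords_a, **grid_kwargs)
    b = occupancy_grid(coords_b, **grid_kwargs)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


# Toy "molecules": atom centers only, already in a shared frame.
ligand = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
shifted = ligand + np.array([1.0, 0.0, 0.0])
print(shape_tanimoto(ligand, ligand))
print(shape_tanimoto(ligand, shifted))
```

A score near 1 means the two volumes nearly coincide and a score near 0 means they barely overlap. A voxel grid is used here for simplicity; practical tools often use Gaussian atom volumes and optimize the alignment before scoring.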

Paper Structure

This paper contains 51 sections, 53 equations, 7 figures, 18 tables, 3 algorithms.

Figures (7)

  • Figure 1: Heatmaps of Similarities Calculated from Molecules Generated by SQUID and DiffSMol.
  • Figure 2: Generated 3D Molecules from Different Methods. Molecule 3D shapes are shown as shaded regions; generated molecules are superimposed on the condition molecule; and the molecular graphs of generated molecules are presented.
  • Figure 3: Generated drug candidate NL-001 for CDK6
  • Figure 4: Generated drug candidate NL-002 for CDK6
  • Figure 5: Generated drug candidate NL-003 for NEP
  • ...and 2 more figures