Manifold-Constrained Nucleus-Level Denoising Diffusion Model for Structure-Based Drug Design

Shengchao Liu, Divin Yan, Weitao Du, Weiyang Liu, Zhuoxinran Li, Hongyu Guo, Christian Borgs, Jennifer Chayes, Anima Anandkumar

TL;DR

This work introduces NucleusDiff, a learning approach that incorporates physically motivated geometric constraints to improve the physical plausibility and binding affinity of generated molecular structures, and demonstrates improved molecular realism and practical benefits for drug design tasks.

Abstract

Artificial intelligence models have shown great potential in structure-based drug design, generating ligands with high binding affinities. However, existing models often overlook a crucial physical constraint: atoms must maintain a minimum pairwise distance to avoid separation violations, a phenomenon governed by the balance of attractive and repulsive forces. To mitigate such separation violations, we propose NucleusDiff. It models the interactions between atomic nuclei and their surrounding electron clouds by enforcing a distance constraint between the nuclei and the manifolds. We quantitatively evaluate NucleusDiff on the CrossDocked2020 dataset and a COVID-19 therapeutic target, demonstrating that NucleusDiff reduces the violation rate by up to 100.00% and improves binding affinity by up to 22.16%, surpassing state-of-the-art models for structure-based drug design. We also provide qualitative analysis through manifold sampling, visually confirming the effectiveness of NucleusDiff in reducing separation violations and improving binding affinities.
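To make the notion of a separation violation concrete, the sketch below counts the fraction of atom pairs whose nuclei lie closer than a chosen minimum distance, and shows how a relative reduction of 100% corresponds to a violation rate of zero. It is only an illustration: the function names, the threshold, the coordinates, and the example rates are hypothetical, and the paper's exact definition of the violation rate may differ.

```python
import math
from itertools import combinations


def pairwise_violation_ratio(coords, min_dist):
    """Fraction of atom pairs whose nuclei are closer than min_dist (angstroms).

    Assumption: a single global threshold is used for every pair; the paper
    may use element-dependent thresholds instead.
    """
    pairs = list(combinations(coords, 2))
    if not pairs:
        return 0.0
    violations = sum(1 for a, b in pairs if math.dist(a, b) < min_dist)
    return violations / len(pairs)


def relative_reduction(baseline_rate, new_rate):
    """Relative reduction in percent, e.g. a drop to zero is a 100% reduction."""
    return 100.0 * (baseline_rate - new_rate) / baseline_rate


# Three hypothetical atoms; the first two nuclei sit only 0.5 angstrom apart.
atoms = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
ratio = pairwise_violation_ratio(atoms, min_dist=1.0)

# Hypothetical rates, chosen only to illustrate the arithmetic.
reduction = relative_reduction(baseline_rate=0.02, new_rate=0.0)
```

Here one of the three pairs violates the threshold, so `ratio` is one third, and a model that removes every violation yields a relative reduction of 100%.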

Paper Structure (82 sections, 15 equations, 14 figures, 23 tables, 2 algorithms)

Figures (14)

  • Figure 1: (a) Illustration of the nucleus, the electron cloud, and the manifold of an atom. The electron cloud represents the probabilistic distribution of electrons around the nucleus, and the manifold is the sphere corresponding to the average distance from the nucleus to the outermost electrons in the electron cloud. (b) Illustration of the manifold surrounding a molecule. (c) Illustration of the mesh points obtained from discretizing a manifold. (d) Pipeline of NucleusDiff. NucleusDiff performs denoising diffusion on both the nuclei and the discretized mesh points, where the distances between them approximate the van der Waals radii.
  • Figure 2: (a) Visualization of generated ligands for the target 2HCJ. (b) Pairwise-level violation ratio of TargetDiff and NucleusDiff during inference on the CrossDocked2020 dataset. (c) Pairwise-level violation ratio of TargetDiff and NucleusDiff during inference on the COVID-19 therapeutic target. (d) Binding affinities (Vina Dock) for 10k sampled ligands and the given proteins.
  • Figure 3: Visualization of the pockets and sampled ligands on CrossDocked2020 and COVID-19. The sampled molecules are generated using TargetDiff and NucleusDiff. For NucleusDiff, we show both the generated nuclei and the manifold (green spheres). Ligand quality is measured with the Vina Score, where a lower score indicates stronger binding affinity.
  • Figure 4: Comparison of the manifold constraint and the minimum-distance constraint.
  • Figure 5: The ring size distribution of molecules generated by the baseline models and NucleusDiff.
  • ...and 9 more figures
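
The manifold discretization described in Figure 1 (a) and (c) can be illustrated with a toy example: treat the manifold of a single atom as a sphere whose radius is the van der Waals radius, and place mesh points on it. The sketch below uses a Fibonacci lattice for the point placement, which is an assumption made here for simplicity and not necessarily the meshing procedure used by NucleusDiff; the function name and parameters are hypothetical.

```python
import math


def sphere_mesh_points(center, radius, n_points=32):
    """Place n_points roughly uniformly on a sphere using a Fibonacci lattice.

    Assumption: a per-atom sphere stands in for the manifold; the paper's
    molecular manifold and meshing method may differ.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    cx, cy, cz = center
    points = []
    for i in range(n_points):
        z = 1.0 - 2.0 * (i + 0.5) / n_points  # heights strictly inside (-1, 1)
        r_xy = math.sqrt(1.0 - z * z)         # radius of the circle at height z
        theta = golden_angle * i
        points.append((
            cx + radius * r_xy * math.cos(theta),
            cy + radius * r_xy * math.sin(theta),
            cz + radius * z,
        ))
    return points


# 1.7 angstroms is a commonly quoted van der Waals radius for carbon.
nucleus = (1.0, 2.0, 3.0)
mesh = sphere_mesh_points(center=nucleus, radius=1.7, n_points=32)
distances = [math.dist(nucleus, p) for p in mesh]
```

Every mesh point lies at the chosen radius from the nucleus, which is the kind of nucleus-to-manifold distance that Figure 1 (d) describes as approximating the van der Waals radii.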