Functional-Group-Based Diffusion for Pocket-Specific Molecule Generation and Elaboration

Haitao Lin, Yufei Huang, Odin Zhang, Lirong Wu, Siyuan Li, Zhiyuan Chen, Stan Z. Li

TL;DR

D3FG introduces a functional-group-based diffusion framework for pocket-specific molecule generation and elaboration, overcoming atom-level generation limitations by treating functional groups as rigid bodies and linkers as mass points within SE(3)-equivariant denoisers. The method supports joint and two-stage generation on a heterogeneous graph, producing more realistic 3D structures and competitive binding affinities, and extends to molecule elaboration using fragment hotspot maps. Key contributions include a curated functional-group corpus, a diffusion-based generative pipeline with three coordinated diffusion processes (types, positions, orientations), and an elaboration task that demonstrates affinity improvements while preserving core skeletons. Empirical results on CrossDocked2020 show that the two-stage D3FG variant achieves superior structural realism and drug properties, with elaboration experiments indicating potential for affinity-driven optimization in a target-aware setting.

Abstract

In recent years, AI-assisted drug design methods have been proposed to generate molecules given the pocket structures of target proteins. Most of them are atom-level methods, which treat atoms as basic components and generate atom positions and types. With this approach, however, it is hard to generate realistic fragments with complicated structures. To address this, we propose D3FG, a functional-group-based diffusion model for pocket-specific molecule generation and elaboration. D3FG decomposes molecules into two categories of components: functional groups defined as rigid bodies and linkers defined as mass points. Together, the two kinds of components can form complicated fragments that enhance ligand-protein interactions. Specifically, in the diffusion process, D3FG diffuses the data distribution of the positions, orientations, and types of the components into a prior distribution; in the generative process, the noise is gradually removed from the three variables by denoisers parameterized with specifically designed equivariant graph neural networks. In experiments, our method generates molecules with more realistic 3D structures, competitive affinities toward the protein targets, and better drug properties. In addition, as a solution to the new task of molecule elaboration, D3FG can generate molecules with high affinities based on existing ligands and the hotspots of target proteins.
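
To make the three diffused variables concrete, the following is a minimal sketch of one forward noising step on a single component. It is an illustration under stated assumptions, not the authors' implementation: the `Component` class, the `noise_component` function, and the `alpha_bar` schedule value are hypothetical names introduced here. Positions use a standard Gaussian (DDPM-style) perturbation, types use a keep-or-resample-uniformly categorical corruption, and orientations are perturbed by a small random rotation, which is a simplification standing in for whatever SO(3) noise the paper actually uses. Linkers are mass points, so they carry no orientation.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional

Matrix = List[List[float]]
IDENTITY: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


@dataclass
class Component:
    """Hypothetical container for one molecular component."""
    kind: str                      # "functional_group" (rigid body) or "linker" (mass point)
    type_index: int                # index into an assumed vocabulary of component types
    position: List[float]          # 3D coordinates of the component centre
    orientation: Optional[Matrix]  # 3x3 rotation matrix, None for linkers


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def axis_angle_to_matrix(axis: List[float], angle: float) -> Matrix:
    """Rodrigues' formula for a rotation of `angle` radians about a unit `axis`."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]


def random_unit_vector(rng: random.Random) -> List[float]:
    """Draw a direction uniformly on the sphere by normalising a Gaussian sample."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def noise_component(comp: Component, alpha_bar: float, num_types: int,
                    rng: random.Random) -> Component:
    """Apply one forward noising step; alpha_bar in [0, 1] is the retained signal."""
    signal, sigma = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    # Position: x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps
    position = [signal * x + sigma * rng.gauss(0.0, 1.0) for x in comp.position]
    # Type: keep with probability alpha_bar, otherwise resample uniformly
    if rng.random() < alpha_bar:
        type_index = comp.type_index
    else:
        type_index = rng.randrange(num_types)
    # Orientation (simplified assumption): compose with a small random rotation
    orientation = None
    if comp.orientation is not None:
        angle = sigma * rng.gauss(0.0, 1.0)
        perturbation = axis_angle_to_matrix(random_unit_vector(rng), angle)
        orientation = matmul(perturbation, comp.orientation)
    return Component(comp.kind, type_index, position, orientation)


rng = random.Random(0)
group = Component("functional_group", 3, [1.0, -2.0, 0.5], IDENTITY)
noisy_group = noise_component(group, alpha_bar=0.5, num_types=25, rng=rng)
```

The value `num_types=25` follows the twenty-five functional groups mentioned in the Figure 2 caption. A denoiser would be trained to reverse such steps jointly for all components, conditioned on the pocket.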

Paper Structure (46 sections, 23 equations, 5 figures, 11 tables, 1 algorithm)

This paper contains 46 sections, 23 equations, 5 figures, 11 tables, 1 algorithm.

Figures (5)

  • Figure 1: An illustration of the D3FG workflows under the two schemes.
  • Figure 2: Five of the twenty-five functional groups with stable structures that occur most frequently in CrossDocked2020 and are used in D3FG.
  • Figure 3: Atom type distribution and metrics.
  • Figure 4: Generated molecules by different methods on pocket 3o96_A_rec. The diffusion-based methods generated molecules more similar to the reference, appearing to be 'vertical'.
  • Figure 5: Affinity metrics with the change of repository sizes.