Peptide2Mol: A Diffusion Model for Generating Small Molecules as Peptide Mimics for Targeted Protein Binding

Xinheng He, Yijia Zhang, Haowei Lin, Xingang Peng, Xiangzhe Kong, Mingyu Li, Jianzhu Ma

TL;DR

Peptide2Mol bridges the gap between peptide binders and drug-like small molecules: an $E(3)$-equivariant diffusion model learns a trajectory from peptide-binding interfaces to small-molecule mimics that fit the same pocket. It is trained on diverse data (small molecules, protein–ligand complexes, and peptide–protein interfaces) and generates pocket-aware molecules non-autoregressively, preserving peptide-like interactions while achieving drug-like properties. The method supports partial diffusion for peptidomimetic optimization, and its outputs can be refined with Pocket2Mol to improve docking plausibility. Reported results show competitive property metrics and residue-level mimicry, validated by PMI analyses. By enabling peptide-to-small-molecule translation within 3D pockets, the work offers a new avenue for targeted protein binding and peptidomimetic design.

Abstract

Structure-based drug design has seen significant advancements with the integration of artificial intelligence (AI), particularly in the generation of hit and lead compounds. However, most AI-driven approaches neglect the importance of endogenous protein interactions with peptides, which may result in suboptimal molecule designs. In this work, we present Peptide2Mol, an E(3)-equivariant graph neural network diffusion model that generates small molecules by referencing both the original peptide binders and their surrounding protein pocket environments. Trained on large datasets and leveraging sophisticated modeling techniques, Peptide2Mol not only achieves state-of-the-art performance in non-autoregressive generative tasks, but also produces molecules with similarity to the original peptide binder. Additionally, the model allows for molecule optimization and peptidomimetic design through a partial diffusion process. Our results highlight Peptide2Mol as an effective deep generative model for generating and optimizing bioactive small molecules from protein binding pockets.
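
The partial diffusion process mentioned above can be illustrated with a minimal sketch. It assumes a standard DDPM-style forward process with a cosine noise schedule and acts on atom coordinates only; the paper's actual schedule, its treatment of atom and bond types, and its pocket conditioning are not given in this summary, so `cosine_alpha_bar`, `noise_to_step`, `partial_diffusion`, and the `denoise_step` callback are hypothetical names rather than Peptide2Mol's API. The idea is to noise a reference structure only up to an intermediate step `t_start` instead of the final step, then run the reverse process from there, so that the output stays close to the reference.

```python
import numpy as np


def cosine_alpha_bar(num_steps: int, s: float = 0.008) -> np.ndarray:
    """Cumulative signal fraction alpha_bar[t] for t = 0..num_steps.

    Assumption: a cosine schedule; the paper's schedule may differ.
    """
    t = np.arange(num_steps + 1) / num_steps
    f = np.cos((t + s) / (1 + s) * np.pi / 2) ** 2
    return f / f[0]  # normalise so that alpha_bar[0] == 1


def noise_to_step(x0, t, alpha_bar, rng):
    """Jump from clean coordinates x0 to step t using the closed-form forward process."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps


def partial_diffusion(x0, t_start, alpha_bar, denoise_step, rng):
    """Noise x0 up to t_start (not the final step), then denoise back to step 0.

    denoise_step(x, t) is a hypothetical stand-in for one reverse step of a
    trained, pocket-conditioned network.
    """
    x = noise_to_step(x0, t_start, alpha_bar, rng)
    for t in range(t_start, 0, -1):
        x = denoise_step(x, t)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_bar = cosine_alpha_bar(1000)
    reference = rng.standard_normal((20, 3))  # toy coordinates for 20 atoms

    # Identity "denoiser" used only so the sketch runs without a trained model.
    result = partial_diffusion(reference, 50, alpha_bar, lambda x, t: x, rng)
    print(result.shape)
```

A smaller `t_start` keeps the result closer to the starting peptide or molecule, while a larger one allows more exploration; in practice `denoise_step` would wrap the trained equivariant network conditioned on the pocket.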

Paper Structure

This paper contains 14 sections, 5 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Overview of the Peptide2Mol model. (a) Dataset composition, training, and inference workflow. The model is trained on peptide and small-molecule structures, and inference generates candidate ligands for target protein pockets. (b) Schematic representation of the edges that encode non-covalent interactions between ligands and the protein pocket. (c) Model architecture of Peptide2Mol.
  • Figure 2: Waterfall diagram illustrating the stepwise evaluation of AI-generated molecules against the PoseBusters criteria. Each method was set to generate 100 molecules per target across the test-set targets. Panels show results for LiGAN (a), Pocket2Mol (b), TargetDiff (c), PocketFlow (d), Peptide2Mol (e), and Peptide2Mol-Fixed (f).
  • Figure 3: Geometric and property-based evaluation of generated molecules. (a–i) Bond length distributions of molecules generated by different AI-based methods compared with those in the training set. Nine representative bond types are analyzed: C–C (a), C=C (b), C–O (c), C=O (d), C–N (e), C=N (f), C–Cl (g), C–S (h), and C–F (i).
  • Figure 4: Histogram of the top replacement fragments from small molecules for four representative residues (ARG, ASP, LEU, and TYR). Color reflects the elemental composition of each fragment (green: carbon; blue: nitrogen; red: oxygen; gray: others).
  • Figure 5: Representative examples showing that Peptide2Mol can transform (a) a peptide binder (PDB ID: 7WXO) and (b) an antibody CDR (PDB ID: 3NGB) into corresponding small molecules that mimic their binding interfaces.