Hotspot-Driven Peptide Design via Multi-Fragment Autoregressive Extension

Jiahan Li, Tong Chen, Shitong Luo, Chaoran Cheng, Jiaqi Guan, Ruihan Guo, Sheng Wang, Ge Liu, Jian Peng, Jianzhu Ma

TL;DR

PepHAR tackles the challenge of designing target-specific peptide binders by separating residues into hot spots and scaffolds and then combining three stages: hot-spot sampling with an energy-based model, autoregressive fragment extension guided by dihedral angles, and a geometry-aware correction step. The approach leverages an $SE(3)$-invariant IPA backbone and von Mises angle modeling to maintain peptide bond geometry while constructing sequences around key hot spots. Across de novo binder design and scaffold generation benchmarks, PepHAR improves geometric validity, native-like conformations, and energy/affinity metrics, outperforming several baselines when hotspots are provided or inferred. The work also introduces a pragmatic scaffold-generation setting, providing a practical pathway toward therapeutic peptide design and highlighting the potential for integrating hot-spot knowledge into generative peptide design systems, with open-source code available at https://github.com/Ced3-han/PepHAR.

Abstract

Peptides, short chains of amino acids, interact with target proteins, making them a unique class of protein-based therapeutics for treating human diseases. Recently, deep generative models have shown great promise in peptide generation. However, several challenges remain in designing effective peptide binders. First, not all residues contribute equally to peptide-target interactions. Second, the generated peptides must adopt valid geometries due to the constraints of peptide bonds. Third, realistic tasks for peptide drug development are still lacking. To address these challenges, we introduce PepHAR, a hot-spot-driven autoregressive generative model for designing peptides targeting specific proteins. Building on the observation that certain hot spot residues have higher interaction potentials, we first use an energy-based density model to fit and sample these key residues. Next, to ensure proper peptide geometry, we autoregressively extend peptide fragments by estimating dihedral angles between residue frames. Finally, we apply an optimization process to iteratively refine fragment assembly, ensuring correct peptide structures. By combining hot spot sampling with fragment-based extension, our approach enables de novo peptide design tailored to a target protein and allows the incorporation of key hot spot residues into peptide scaffolds. Extensive experiments, including peptide design and peptide scaffold generation, demonstrate the strong potential of PepHAR in computational peptide binder design. Source code will be available at https://github.com/Ced3-han/PepHAR.
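The TL;DR mentions von Mises angle modeling for the dihedral angles used during fragment extension. As a rough illustration only, the sketch below draws a dihedral angle from a von Mises distribution using the Python standard library. The mean `mu_phi` and concentration `kappa` are hypothetical stand-ins for whatever a trained network would predict; the function names are illustrative and are not taken from the PepHAR code base.

```python
import math
import random


def wrap_angle(theta):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def sample_dihedral(mu, kappa, rng):
    """Draw one dihedral angle (radians) from a von Mises distribution.

    mu    : mean angle in radians (assumed to come from a model)
    kappa : concentration >= 0; larger values give a tighter peak
    rng   : a random.Random instance, passed in for reproducibility
    """
    # The stdlib sampler documents mu in [0, 2*pi), so shift it first.
    theta = rng.vonmisesvariate(mu % (2.0 * math.pi), kappa)
    return wrap_angle(theta)


def circular_mean(angles):
    """Mean direction of a list of angles, computed on the unit circle."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)


rng = random.Random(0)
mu_phi = math.radians(-60.0)  # illustrative value, not from the paper
kappa = 8.0                   # illustrative value, not from the paper
samples = [sample_dihedral(mu_phi, kappa, rng) for _ in range(2000)]
mean_phi = circular_mean(samples)
```

Because angles are periodic, the samples are summarized with a circular mean rather than an arithmetic one; an arithmetic mean would give misleading results for angles clustered near $\pm\pi$.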

Paper Structure

This paper contains 63 sections, 42 equations, 7 figures, 8 tables, 1 algorithm.

Figures (7)

  • Figure 1: Left: Hot spot residues are a small number of critical residues in the peptide-target binding interface, while the remaining residues act as scaffolds. In peptide design, we first sample the hot spots and then use scaffold residues to link them. Right: Each residue in the protein consists of backbone heavy atoms and side-chain groups. Adjacent residues are connected by peptide bonds, which establish a planar conformation around neighboring atoms. The backbone structure of adjacent residues can be reconstructed using dihedral angles through the operations Left and Right.
  • Figure 2: Overview of our three-stage approach: In the first, founding stage, $k$ hot-spot residues are generated ($k=2$ in this example) from a learned residue distribution around the target. In the second stage, new residues are added to the left or right of each fragment based on dihedral angle distributions. Finally, in the correction stage, gradients from the objective functions are applied to refine the complete peptide.
  • Figure 3: RMSD of generated peptides, considering different tasks and numbers of hotspots. More hotspot residues lead to better results.
  • Figure 4: Two examples of generated peptides, along with RMSD and binding energy. PepHAR can generate native-like peptides with better binding affinities.
  • Figure 5: Examples of scaffolded peptides generated by PepHAR. PepHAR can scaffold hotspot residues, leading to more stable complexes with native-like, valid geometries.
  • ...and 2 more figures
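
The Figure 1 caption notes that the backbone of an adjacent residue can be reconstructed from dihedral angles. One common way to do this kind of reconstruction is to place the next atom from three preceding atoms plus a bond length, a bond angle, and a dihedral angle (internal-to-Cartesian conversion). The sketch below shows that generic construction in plain Python; it is not the paper's Left/Right operators, and the bond length, bond angle, and coordinates are illustrative textbook-style values assumed for the example.

```python
import math


def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    return math.sqrt(dot(u, u))


def unit(u):
    n = norm(u)
    return (u[0] / n, u[1] / n, u[2] / n)


def place_atom(a, b, c, bond_length, bond_angle, dihedral):
    """Place atom d so that |cd| = bond_length, angle(b, c, d) = bond_angle,
    and the dihedral (a, b, c, d) equals `dihedral`. Angles are in radians."""
    bc = unit(sub(c, b))
    n = unit(cross(sub(b, a), bc))  # normal of the plane through a, b, c
    m = cross(n, bc)                # completes a right-handed local frame
    x = -bond_length * math.cos(bond_angle)
    y = bond_length * math.sin(bond_angle) * math.cos(dihedral)
    z = bond_length * math.sin(bond_angle) * math.sin(dihedral)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))


def dihedral_angle(a, b, c, d):
    """Signed dihedral angle (radians) defined by four points."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    return math.atan2(norm(b2) * dot(b1, n2), dot(n1, n2))


# Hypothetical coordinates for N, CA, C of one residue (not real data).
atom_n = (0.0, 1.0, 0.0)
atom_ca = (0.0, 0.0, 0.0)
atom_c = (1.5, 0.0, 0.0)

psi = math.radians(-47.0)      # illustrative dihedral
c_n_length = 1.33              # assumed C-N peptide bond length in angstroms
ca_c_n_angle = math.radians(116.0)  # assumed CA-C-N bond angle

next_n = place_atom(atom_n, atom_ca, atom_c, c_n_length, ca_c_n_angle, psi)
recovered_psi = dihedral_angle(atom_n, atom_ca, atom_c, next_n)
```

Recomputing the dihedral from the placed atom, as done with `recovered_psi`, is a cheap consistency check that the sign convention of the placement matches the convention used to measure angles.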