Generative design and validation of therapeutic peptides for glioblastoma based on a potential target ATP5A

Hao Qian, Pu You, Lin Zeng, Jingyuan Zhou, Dengdeng Huang, Kaicheng Li, Shikui Tu, Lei Xu

TL;DR

This framework introduces the first lead-conditioned generative model, which focuses exploration on geometrically relevant regions around lead peptides and mitigates the combinatorial complexity of de novo methods.

Abstract

Glioblastoma (GBM) remains the most aggressive tumor, urgently requiring novel therapeutic strategies. Here, we present a dry-to-wet framework combining generative modeling and experimental validation to optimize peptides targeting ATP5A, a potential peptide-binding protein for GBM. Our framework introduces the first lead-conditioned generative model, which focuses exploration on geometrically relevant regions around lead peptides and mitigates the combinatorial complexity of de novo methods. Specifically, we propose POTFlow, a Prior and Optimal Transport-based Flow-matching model for peptide optimization. POTFlow employs secondary structure information (e.g., helix, sheet, loop) as geometric constraints, which are further refined by optimal transport to produce shorter flow paths. With this design, our method achieves state-of-the-art performance compared with five popular approaches. When applied to GBM, our method generates peptides that selectively inhibit cell viability and significantly prolong survival in a patient-derived xenograft (PDX) model. As the first lead peptide-conditioned flow matching model, POTFlow holds strong potential as a generalizable framework for therapeutic peptide design.

Paper Structure

This paper contains 20 sections, 5 theorems, 56 equations, 9 figures, 4 tables, 2 algorithms.

Key Result

Proposition 1

The sampling scheme in Definition 1 is rotation equivariant: for any rotation matrix $R \in SO(3)$, if the original data points are rotated by $R$, the newly sampled points are rotated by $R$ as well.
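
The property can be checked numerically on a toy case. The sketch below does not reproduce the paper's sampling scheme, which is not given on this page; it assumes a hypothetical scheme that places samples at the centroid of the data points plus a noise offset. All function names are illustrative. For such a scheme, rotating the inputs and then sampling gives the same points as sampling and then rotating, provided the noise is rotated too (for isotropic Gaussian noise, the rotated noise has the same distribution as the original).

```python
import math

def rot_x(a):
    """Rotation matrix about the x-axis by angle a (radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(a):
    """Rotation matrix about the z-axis by angle a (radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotate(R, p):
    """Apply the 3x3 matrix R to the 3D point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) for i in range(3))

def centroid(points):
    """Coordinate-wise mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def sample_around_centroid(points, noise):
    """Hypothetical sampling scheme (an assumption): centroid plus noise offsets."""
    c = centroid(points)
    return [tuple(c[k] + e[k] for k in range(3)) for e in noise]

# A composition of two rotations is again an element of SO(3).
R = matmul(rot_x(0.7), rot_z(1.3))

points = [(1.0, 2.0, 0.5), (-0.5, 1.0, 2.0), (0.0, -1.5, 1.0)]
noise = [(0.1, -0.2, 0.3), (-0.4, 0.0, 0.2)]

# Path A: rotate the data (and the noise draw), then sample.
rotate_then_sample = sample_around_centroid(
    [rotate(R, p) for p in points], [rotate(R, e) for e in noise]
)
# Path B: sample first, then rotate the samples.
sample_then_rotate = [rotate(R, s) for s in sample_around_centroid(points, noise)]

# Largest coordinate-wise gap between the two paths (floating-point error only).
max_dev = max(
    abs(a - b)
    for p, q in zip(rotate_then_sample, sample_then_rotate)
    for a, b in zip(p, q)
)
```

The two paths agree because both the centroid and the rotation are linear maps, so they commute.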

Figures (9)

  • Figure 1: Schematic workflow from in silico peptide design to experimental validation. a) We start from a lead peptide sequence and the 3D structure of its target protein. b) POTFlow efficiently samples peptide–protein complexes within the lead peptide-conditioned space. c) An expert system clusters candidate peptides, ranks their binding affinities, and analyzes intermolecular interactions (e.g., hydrogen bonds, water bridges, $\pi$–stacking). d) Promising candidates are synthesized and tested by cell-viability assays and patient-derived xenograft (PDX) models.
  • Figure 1: Comparison between global initialization (left) and class-specific initialization (right). The black arrows indicate the flow velocity vectors. As proved in the Supplementary Notes (Improved Initialization), the flow trajectories in the right diagram are shorter than those in the left diagram.
  • Figure 2: a) An overview of POTFlow. First, class-specific centroids are computed from lead peptide structures. Here, "class" represents the peptide folding type (e.g., helix, sheet, loop). Next, based on optimal transport theory, multimodal couplings between peptides and initial noise variables are established. Finally, short disentangled paths are built and a lead-conditioned flow matching model generates high-affinity complex structures. b) Corresponding residue-level illustration of how POTFlow constructs more efficient generation trajectories via lead-conditioned initialization. c) Structural definition of a residue unit. d) Computational workflow of our model at time step $t$.
  • Figure 2: A toy example using 2D points to illustrate the different flow trajectories before (left) and after (right) applying the optimal transport plan.
  • Figure 3: a) RMSD values for different numbers of secondary structures across four models. The RMSD values are computed between generated peptides and the lead peptides in the test set. b) Ramachandran plot of POTFlow-generated and lead peptides. c) Visualization of the ATP5A subunit. The red box indicates the experimentally validated peptide-binding site used as the input for subsequent generative modeling. d-e) Detailed non-covalent interactions of two generated candidates with protein ATP5A, as identified by the Protein–Ligand Interaction Profiler (PLIP) (Salentin et al., 2015). Hydrogen bonds are shown as blue solid lines, hydrophobic contacts as red dashed lines, and salt bridges as green dashed lines.
  • ...and 4 more figures
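
The two toy comparisons in the captions above (global versus class-specific initialization, and pairings before versus after optimal transport) can be illustrated with a small 2D sketch. This is not the authors' implementation. The coordinates, the class names used as dictionary keys, and the brute-force assignment solver are all assumptions chosen to keep the example self-contained; a practical system would use a dedicated optimal transport or assignment solver.

```python
import itertools
import math

def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def total_length(sources, targets, perm):
    """Sum of straight-line path lengths when source i flows to target perm[i]."""
    return sum(dist(sources[i], targets[j]) for i, j in enumerate(perm))

def squared_cost(sources, targets, perm):
    """Squared-distance transport cost of the pairing perm."""
    return sum(dist(sources[i], targets[j]) ** 2 for i, j in enumerate(perm))

def ot_pairing(sources, targets):
    """Brute-force optimal assignment under squared-distance cost (tiny sets only)."""
    n = len(sources)
    return min(itertools.permutations(range(n)),
               key=lambda perm: squared_cost(sources, targets, perm))

def centroid(points):
    """Coordinate-wise mean of a list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# Hypothetical toy targets: two well-separated "classes" of 2D points.
classes = {
    "helix": [(-4.0, 0.5), (-4.5, -0.5), (-3.5, 0.0)],
    "sheet": [(4.0, 0.5), (4.5, -0.5), (3.5, 0.0)],
}
targets = classes["helix"] + classes["sheet"]

# Fixed noise offsets (stand-ins for random draws, so the example is deterministic).
offsets = [(0.6, -0.3), (-0.4, 0.8), (0.1, -0.7)]
mirrored = [(-x, -y) for x, y in offsets]

# Global initialization: all noise points sit around the origin.
global_sources = offsets + mirrored

# Class-specific initialization: noise points sit around each class centroid.
ch, cs = centroid(classes["helix"]), centroid(classes["sheet"])
class_sources = ([(ch[0] + x, ch[1] + y) for x, y in offsets]
                 + [(cs[0] + x, cs[1] + y) for x, y in mirrored])

identity = tuple(range(len(targets)))

# Path lengths with the naive index-by-index pairing.
global_len = total_length(global_sources, targets, identity)
class_len = total_length(class_sources, targets, identity)

# Transport cost before and after re-pairing with the optimal assignment.
identity_cost = squared_cost(class_sources, targets, identity)
best_perm = ot_pairing(class_sources, targets)
ot_cost = squared_cost(class_sources, targets, best_perm)
```

In this toy setting, starting the noise near each class centroid shortens the straight-line paths compared with starting everything near the origin, and the optimal assignment can only lower the squared-distance cost further, since the index-by-index pairing is one of the candidates it searches over.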

Theorems & Definitions (10)

  • Definition 1
  • Proposition 1: Rotation Equivariance
  • Proposition 2: Improved Initialization via Class-Specific Centroids
  • Theorem 3
  • proof
  • proof
  • Lemma 1
  • proof: Proof of Lemma
  • proof
  • Corollary 3.1: Efficiency of Flow Matching