Cross-Chirality Generalization by Axial Vectors for Hetero-Chiral Protein-Peptide Interaction Design

Ziyi Yang, Zitong Tian, Yinjun Jia, Tianyi Zhang, Jiqing Zheng, Hao Wang, Yubu Su, Juncai He, Lei Liu, Yanyan Lan

TL;DR

This work shows that by injecting axial features into $E(3)$-equivariant (polar) vector features, it is feasible to achieve cross-chirality generalization from homo-chiral training data to hetero-chiral design tasks.

Abstract

D-peptide binders targeting L-proteins have promising therapeutic potential. Despite rapid advances in machine learning-based target-conditioned peptide design, generating D-peptide binders remains largely unexplored. In this work, we show that by injecting axial features into $E(3)$-equivariant (polar) vector features, it is feasible to achieve cross-chirality generalization from homo-chiral (L--L) training data to hetero-chiral (D--L) design tasks. By implementing this method within a latent diffusion model, we achieved D-peptide binder design that not only outperforms existing tools in in silico benchmarks, but also demonstrates efficacy in wet-lab validation. To our knowledge, our approach represents the first wet-lab validated generative AI for the de novo design of D-peptide binders, offering new perspectives on handling chirality in protein design.

Paper Structure (50 sections, 9 theorems, 102 equations, 12 figures, 8 tables, 1 algorithm)

This paper contains 50 sections, 9 theorems, 102 equations, 12 figures, 8 tables, 1 algorithm.

Key Result

Theorem 3.1

For a sample of amino acid $X$, under some mild assumptions, we have for any $\varepsilon\in(0,1)$, with probability at least $1 - \delta_W(\varepsilon)$, where $c_W$ is a constant.

Figures (12)

  • Figure 1: We use the chirality-sensitive model PepMirror to design D-peptides by flipping. PepMirror is a latent diffusion model using AFI-EPT, which injects axial vector features to learn chirality. Axial vectors are invariant under spatial inversion, and we give three direct constructions. The commutator feature (third) captures higher-frequency information about the angle between $u$ and $v$.
  • Figure 2: The max-pooled pairwise Tanimoto shape similarity between L/D amino acids. The similarities between an amino acid and its enantiomer are among the highest compared with similarities between different amino acids. Because similarities between the same amino acid or between "D-Gly" and "L-Gly" are 1.0 by definition, we excluded these entries from the heatmap.
  • Figure 3: Latent-code distances between each amino acid and its inverted counterpart ($X$ vs. $-X$) for L- and D-forms across different encoder variants. "All" and "Peptide" refer to UniMoMo without AFI trained on different datasets (see Section \ref{sec:experiment}); the other three variants are equipped with AFI based on different axial features. Distances are summarized as box plots over all amino acids. Encoders equipped with AFI exhibit a non-negligible inversion-induced discrepancy. The number on each block is the median distance.
  • Figure 4: Left: t-SNE of 20 types of amino acids, including both L and D chirality. Because t-SNE does not preserve distances, we also plot a heatmap (right) of mean pairwise latent-code distances among 40 amino-acid classes (20 L-amino acids and 20 D-amino acids). The entries on the three diagonals are two orders of magnitude smaller than the off-diagonal entries ($10^{-2}$ vs. $1$), indicating tight within-class clustering and clear inter-class separation. The model used here is PepMirror (cross); for other models, see Fig. \ref{fig:SI-heatmap-cross}, Fig. \ref{fig:SI-heatmap-triple}, and Fig. \ref{fig:SI-heatmap-commutator}.
  • Figure 5: The identified D-peptide binder against CD38. Left: Complex structure of D-1412 and CD38 generated by PepMirror (cross), where multiple interactions can be identified. Middle: Stacked association and dissociation curves at different concentrations, with kinetic fitting. Right: Steady-state fitting of the maximum response at each concentration; the blue line marks the observed KD.
  • ...and 7 more figures
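
The parity behavior that the Figure 1 caption relies on can be checked numerically. The sketch below is a generic illustration and not the paper's AFI-EPT code: it uses the textbook cross product as an example of an axial vector and the scalar triple product as an example of a chirality-sensitive pseudoscalar. The helper names (`cross`, `invert`, `signed_volume`) and the sample vectors are assumptions made for this example, and it does not claim to reproduce the three constructions the authors use.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def cross(u: Vec, v: Vec) -> Vec:
    """Cross product u x v, a standard example of an axial vector."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dot(u: Vec, v: Vec) -> float:
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def invert(v: Vec) -> Vec:
    """Spatial inversion x -> -x applied to a polar vector."""
    return (-v[0], -v[1], -v[2])


def signed_volume(u: Vec, v: Vec, w: Vec) -> float:
    """Scalar triple product u . (v x w), a pseudoscalar."""
    return dot(u, cross(v, w))


# Hypothetical polar vectors, e.g. bond directions around a chiral center.
u: Vec = (1.0, 2.0, 3.0)
v: Vec = (-2.0, 0.5, 4.0)
w: Vec = (0.0, 1.0, -1.0)

# Each polar input flips sign under inversion, but their cross product
# picks up two sign flips and is therefore unchanged.
axial = cross(u, v)
axial_inverted = cross(invert(u), invert(v))

# The signed volume picks up three sign flips, so it changes sign:
# this is the kind of quantity that can tell a structure from its mirror image.
vol = signed_volume(u, v, w)
vol_inverted = signed_volume(invert(u), invert(v), invert(w))

# Lengths (and hence pairwise distances) are blind to inversion.
length = math.sqrt(dot(u, u))
length_inverted = math.sqrt(dot(invert(u), invert(u)))

print("axial:", axial, "after inversion:", axial_inverted)
print("signed volume:", vol, "after inversion:", vol_inverted)
print("length:", length, "after inversion:", length_inverted)
```

In this example the cross product and the vector length are identical before and after inversion, while the signed volume changes sign. Features built only from polar vectors and distances behave consistently under a mirror flip, whereas combining them with axial quantities gives a model a handle on handedness.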

Theorems & Definitions (19)

  • Theorem 3.1: Chirality awareness, informal
  • Proposition 3.2: No discrepancy without AFI
  • proof
  • Theorem 3.4: Conditional diffusion is stable in $W_2$
  • Proposition 3.5
  • Remark 2.1
  • proof: proof of Proposition \ref{prop:equivariance-of-a-v}
  • Lemma 2.5: Parity flips only the polar part
  • proof
  • Lemma 2.6: Small-ball bound for a Gaussian bilinear form
  • ...and 9 more