Cross-Chirality Generalization by Axial Vectors for Hetero-Chiral Protein-Peptide Interaction Design

Ziyi Yang; Zitong Tian; Yinjun Jia; Tianyi Zhang; Jiqing Zheng; Hao Wang; Yubu Su; Juncai He; Lei Liu; Yanyan Lan

TL;DR

This work shows that injecting axial features into $E(3)$-equivariant (polar) vector features makes it feasible to achieve cross-chirality generalization from homo-chiral training data to hetero-chiral design tasks.

Abstract

D-peptide binders targeting L-proteins have promising therapeutic potential. Despite rapid advances in machine learning-based target-conditioned peptide design, generating D-peptide binders remains largely unexplored. In this work, we show that by injecting axial features into $E(3)$-equivariant (polar) vector features, it is feasible to achieve cross-chirality generalization from homo-chiral (L--L) training data to hetero-chiral (D--L) design tasks. By implementing this method within a latent diffusion model, we achieved D-peptide binder design that not only outperforms existing tools in in silico benchmarks, but also demonstrates efficacy in wet-lab validation. To our knowledge, our approach represents the first wet-lab validated generative AI for the de novo design of D-peptide binders, offering new perspectives on handling chirality in protein design.
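The distinction between polar and axial vectors that the abstract relies on can be checked numerically. The sketch below is not the paper's AFI construction; it uses the textbook cross product of two polar vectors as an assumed stand-in for an axial feature, and the helper names `random_rotation` and `axial_from_polar` are hypothetical. It verifies that such a feature rotates like an ordinary vector but is unchanged by spatial inversion, whereas a polar combination flips sign.

```python
import numpy as np


def random_rotation(rng):
    """Draw a proper rotation matrix (det = +1) via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # fix column signs so the factorization is unique
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # turn a reflection into a rotation
    return q


def axial_from_polar(u, v):
    """Illustrative axial feature: cross product of two polar vectors (assumption)."""
    return np.cross(u, v)


rng = np.random.default_rng(0)
u, v = rng.normal(size=3), rng.normal(size=3)  # two polar vectors
R = random_rotation(rng)

a = axial_from_polar(u, v)

# Under a rotation, the axial feature transforms like any vector: a -> R a.
rotation_ok = np.allclose(axial_from_polar(R @ u, R @ v), R @ a)

# Under spatial inversion x -> -x, polar vectors flip sign but the axial feature does not.
inversion_ok = np.allclose(axial_from_polar(-u, -v), a)

# A polar combination (here u - v) does flip sign under inversion.
polar_flips = np.allclose((-u) - (-v), -(u - v))

print(rotation_ok, inversion_ok, polar_flips)
```

Because a structure and its mirror image are related by an improper transformation, a feature with this behavior gives a model a signal that differs between enantiomers, which purely polar features cannot provide.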

Paper Structure (50 sections, 9 theorems, 102 equations, 12 figures, 8 tables, 1 algorithm)

Introduction
Related Works
Chirality in protein models
Chirality aware models in geometric learning
Representation theory and chiral features
Design D-peptides as L-protein binders
Method
Preliminary
Hetero-chiral design as a zero-shot generalization
Chirality awareness by introducing axial vectors
Stable representations for enantiomer structures
Encoding stability
Diffusion stability
Implementation of axial feature injection for D-peptide binder design
Experiment
...and 35 more sections

Key Result

Theorem 3.1

For a sample of amino acid $X$, under some mild assumptions, we have for any $\varepsilon\in(0,1)$, with probability at least $1 - \delta_W(\varepsilon)$, where $c_W$ is a constant.

Figures (12)

Figure 1: We use the chirality-sensitive model PepMirror to design D-peptides by flipping. PepMirror is a latent diffusion model using AFI-EPT, which injects axial vector features to learn chirality. Axial vectors are invariant under spatial inversion, and we give three direct constructions. The commutator feature (third) captures higher-frequency information about the angle between $u$ and $v$.
Figure 2: The max-pooled pairwise Tanimoto shape similarity between L/D amino acids. The similarities between an amino acid and its enantiomer are among the highest compared with similarities between different amino acids. Because similarities between the same amino acid or between "D-Gly" and "L-Gly" are 1.0 by definition, we excluded these entries from the heatmap.
Figure 3: Latent-code distances between each amino acid and its inverted counterpart ($X$ vs. $-X$) for L- and D-forms across different encoder variants. "All" and "Peptide" refer to UniMoMo without AFI trained on different datasets (see the Experiment section), and the other three are equipped with AFI based on different axial features. Distances are summarized as box plots over all amino acids. Encoders equipped with AFI exhibit a non-negligible inversion-induced discrepancy. The number on each block is the median distance.
Figure 4: Left: t-SNE of the 20 amino acid types in both L and D chirality. Because t-SNE does not preserve distances, we also plot a heatmap (right) of mean pairwise latent-code distances among 40 amino-acid classes (20 L-amino acids and 20 D-amino acids). The three diagonals are two orders of magnitude smaller than the off-diagonal entries ($10^{-2}$ vs. $1$), indicating tight within-class clustering and clear inter-class separation. The model used here is PepMirror (cross); for other models, see Fig. \ref{fig:SI-heatmap-cross}, Fig. \ref{fig:SI-heatmap-triple}, and Fig. \ref{fig:SI-heatmap-commutator}.
Figure 5: The identified D-peptide binder against CD38. Left: Complex structure of D-1412 and CD38 generated by PepMirror (cross), where multiple interactions can be identified. Middle: Stacked curves of association and dissociation under different concentrations with kinetic fitting. Right: Steady-state fitting of the maximum response at each concentration; the blue line is the observed $K_D$.
...and 7 more figures

Theorems & Definitions (19)

Theorem 3.1: Chirality awareness, informal
Proposition 3.2: No discrepancy without AFI
proof
Theorem 3.4: Conditional diffusion is stable in $W_2$
Proposition 3.5
Remark 2.1
proof: proof of Proposition \ref{prop:equivariance-of-a-v}
Lemma 2.5: Parity flips only the polar part
proof
Lemma 2.6: Small-ball bound for a Gaussian bilinear form
...and 9 more
