PepCompass: Navigating peptide embedding spaces using Riemannian Geometry

Marcin Możejko, Adam Bielecki, Jurand Prądzyński, Marcin Traskowski, Antoni Janowski, Hyun-Su Lee, Marcelo Der Torossian Torres, Michał Kmicikiewicz, Paulina Szymczak, Karol Jurasz, Michał Kucharczyk, Cesar de la Fuente-Nunez, Ewa Szczurek

TL;DR

PepCompass tackles the immense peptide design space by modeling decoder-induced geometry as a union of $κ$-stable Riemannian submanifolds, enabling distortion-free navigation of peptide embeddings. It introduces two local exploration methods, SORBES and MUTANG, and integrates them into Local Enumeration Bayesian Optimization (LE-BO) for efficient, geometry-aware optimization, alongside PoGS for global seed discovery via potential-augmented geodesics. The framework is validated in silico and in vitro, showing PoGS yields high-quality seeds and LE-BO discovers numerous highly active peptides with broad-spectrum activity, including against multidrug-resistant strains. Wet-lab experiments report unprecedented success rates (100% at $\text{MIC} \leq 32\,\mu$g/mL) for PepCompass-generated peptides, underscoring the practical impact of geometry-informed design in antimicrobial peptide discovery and optimization.

Abstract

Antimicrobial peptide discovery is challenged by the astronomical size of peptide space and the relative scarcity of active peptides. Generative models provide continuous latent "maps" of peptide space, but conventionally ignore decoder-induced geometry and rely on flat Euclidean metrics, rendering exploration and optimization distorted and inefficient. Prior manifold-based remedies assume fixed intrinsic dimensionality, which critically fails in practice for peptide data. Here, we introduce PepCompass, a geometry-aware framework for peptide exploration and optimization. At its core, we define a Union of $κ$-Stable Riemannian Manifolds $\mathbb{M}^κ$, a family of decoder-induced manifolds that captures local geometry while ensuring computational stability. We propose two local exploration methods: Second-Order Riemannian Brownian Efficient Sampling, which provides a convergent second-order approximation to Riemannian Brownian motion, and Mutation Enumeration in Tangent Space, which reinterprets tangent directions as discrete amino-acid substitutions. Combining these yields Local Enumeration Bayesian Optimization (LE-BO), an efficient algorithm for local activity optimization. Finally, we introduce Potential-minimizing Geodesic Search (PoGS), which interpolates between prototype embeddings along property-enriched geodesics, biasing discovery toward seeds, i.e. peptides with favorable activity. In-vitro validation confirms the effectiveness of PepCompass: PoGS yields four novel seeds, and subsequent optimization with LE-BO discovers 25 highly active peptides with broad-spectrum activity, including against resistant bacterial strains. These results demonstrate that geometry-informed exploration provides a powerful new paradigm for antimicrobial peptide design.
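The notion of a $κ$-stable dimension can be made concrete with a small numerical sketch. The abstract does not spell out the definition, so the code below assumes it is the number of singular values of the decoder Jacobian that strictly exceed $κ$. The function name `kappa_stable_rank` and the toy Jacobian are hypothetical and chosen only for illustration.

```python
import numpy as np


def kappa_stable_rank(jacobian, kappa):
    """Count singular values of `jacobian` strictly greater than `kappa`.

    Assumption: this thresholded rank stands in for the kappa-stable
    dimension; the exact definition used by PepCompass may differ.
    """
    singular_values = np.linalg.svd(np.atleast_2d(jacobian), compute_uv=False)
    return int(np.sum(singular_values > kappa))


# Toy 3x3 Jacobian: two well-conditioned directions and one that is
# numerically negligible (singular value 1e-12).
J_toy = np.diag([1.0, 1e-3, 1e-12])
rank_toy = kappa_stable_rank(J_toy, kappa=1e-8)
print(rank_toy)
```

Under this assumption, directions whose singular values fall at or below $κ$ are treated as numerically flat and dropped, which is one way a thresholded dimension can vary from point to point instead of being fixed globally.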

Paper Structure

This paper contains 47 sections, 3 theorems, 76 equations, 9 figures, 2 tables, 6 algorithms.

Key Result

Theorem 1

Let $(Z_{i}^{\epsilon})_{i\geq 0}$ be the sequence produced by the SORBES algorithm, for $M_{z}^{\kappa}(\alpha)$ with $\alpha \in (0,1)$ and diffusion horizon $T>0$, and define its continuous-time interpolation Let $R_{z}^{\kappa} = d_{M_{z}^{\kappa}}(z, (W_{z}^{\kappa})^{c})$, and suppose $L \geq 1$ satisfies Then for $T < \tfrac{(R_{z}^{\kappa})^{2}}{4k_{z}^{\kappa}L}$, as $\epsilon \to 0$, t

Figures (9)

  • Figure 1: PepCompass overview.
  • Figure 2: Tangent space as mutation space and local enumeration. A) Around an example peptide GTP we consider two orthogonal peptide-space tangent directions $\Delta\operatorname{Dec}^{(j)}$ obtained from the SVD of the decoder Jacobian at the peptide code. Each direction suggests a specific substitution: $\texttt{T}_{1}\!\to\!\texttt{K}$ and $\texttt{P}_{2}\!\to\!\texttt{C}$. B) Each $\Delta \operatorname{Dec}^{(j)}$ is reshaped into an $L\times A$ map (rows: amino acids; columns: positions). C) Thresholded entries define per-position sets of admissible residues (identity always included). D) The candidate set is the Cartesian product $\mathcal{C}(\texttt{GTP})=\prod_{\ell} S_\ell$.
  • Figure 3: Antimicrobial peptide success rates across MIC thresholds. Success rate is defined as the fraction of generated peptides with MIC below the specified threshold against at least one tested strain. Results are based on experimental validation against 19 bacterial strains, including 8 MDR isolates.
  • Figure 4: Stable rank and peptide statistics. A–B) Distributions of the $\kappa$-stable dimension ($\kappa=10^{-8}$) for HydrAMP (A) and PepCVAE (B) across sampled peptides. C–D) Scatter plots of peptide length versus $\kappa$-stable dimension for HydrAMP (C) and PepCVAE (D), revealing a clear positive correlation: longer peptides tend to yield higher stable dimensions.
  • Figure 5: Example of local non-stability of the $\kappa$-stable dimension. The function $f(x) = x^{3} + \kappa x$ has stable rank $k^{\kappa}$ equal to 1 everywhere except at $0$. The stable rank is therefore not constant on any neighbourhood of $0$, which prevents the application of the Frobenius theorem.
  • ...and 4 more figures
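
The enumeration described in the Figure 2 caption (SVD of the decoder Jacobian, reshaping each direction into an $L\times A$ map, thresholding, Cartesian product) can be sketched on a toy example. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the row-major `(position, residue)` flattening, the residue alphabet ordering, and thresholding on absolute value (used here because the sign of a singular vector is arbitrary) are all assumptions introduced for the example. The toy Jacobian is hand-built so that its two leading directions correspond to the substitutions from the caption.

```python
import itertools

import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids (assumed ordering)


def enumerate_candidates(peptide, jacobian, n_directions, threshold):
    """Enumerate local candidates from leading tangent directions.

    `jacobian` has shape (L * A, latent_dim); each column is a flattened
    L x A map indexed as (position, residue). This layout is an assumption.
    """
    L, A = len(peptide), len(ALPHABET)
    U, _, _ = np.linalg.svd(jacobian, full_matrices=False)
    # The identity residue is always admissible at every position.
    admissible = [{aa} for aa in peptide]
    for j in range(n_directions):
        direction_map = U[:, j].reshape(L, A)
        # Keep residues whose entry is large in magnitude (assumed rule).
        for pos, idx in zip(*np.nonzero(np.abs(direction_map) > threshold)):
            admissible[pos].add(ALPHABET[idx])
    per_position = [sorted(s) for s in admissible]
    # Candidate set = Cartesian product of the per-position sets.
    return ["".join(c) for c in itertools.product(*per_position)]


def substitution_direction(L, pos, from_aa, to_aa):
    """Hypothetical helper: flattened direction that swaps one residue."""
    d = np.zeros((L, len(ALPHABET)))
    d[pos, ALPHABET.index(to_aa)] = 1.0
    d[pos, ALPHABET.index(from_aa)] = -1.0
    return d.ravel()


peptide = "GTP"
# Hand-built toy Jacobian with two orthogonal columns: T1 -> K and P2 -> C.
J = np.stack(
    [
        2.0 * substitution_direction(3, 1, "T", "K"),
        1.0 * substitution_direction(3, 2, "P", "C"),
    ],
    axis=1,
)
candidates = enumerate_candidates(peptide, J, n_directions=2, threshold=0.5)
print(candidates)
```

In this toy setting the per-position sets are {G}, {T, K}, and {P, C}, so the product contains four sequences, including the original peptide. With a real decoder the Jacobian would come from automatic differentiation at the peptide's latent code, and the threshold would control how many substitutions each direction proposes.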

Theorems & Definitions (5)

  • Theorem 1
  • Theorem: Main Theorem
  • proof
  • Lemma 1: Exit-time bound
  • proof