PepCompass: Navigating peptide embedding spaces using Riemannian Geometry
Marcin Możejko, Adam Bielecki, Jurand Prądzyński, Marcin Traskowski, Antoni Janowski, Hyun-Su Lee, Marcelo Der Torossian Torres, Michał Kmicikiewicz, Paulina Szymczak, Karol Jurasz, Michał Kucharczyk, Cesar de la Fuente-Nunez, Ewa Szczurek
TL;DR
PepCompass tackles the immense peptide design space by modeling decoder-induced geometry as a union of $κ$-stable Riemannian submanifolds, enabling distortion-free navigation of peptide embeddings. It introduces two local exploration methods, SORBES and MUTANG, and integrates them into Local Enumeration Bayesian Optimization (LE-BO) for efficient, geometry-aware optimization, alongside PoGS for global seed discovery via potential-augmented geodesics. The framework is validated in silico and in vitro, showing PoGS yields high-quality seeds and LE-BO discovers numerous highly active peptides with broad-spectrum activity, including against multidrug-resistant strains. Wet-lab experiments report unprecedented success rates (100% at $\text{MIC} \leq 32\,\mu$g/mL) for PepCompass-generated peptides, underscoring the practical impact of geometry-informed design in antimicrobial peptide discovery and optimization.
Abstract
Antimicrobial peptide discovery is challenged by the astronomical size of peptide space and the relative scarcity of active peptides. Generative models provide continuous latent "maps" of peptide space, but conventionally ignore decoder-induced geometry and rely on flat Euclidean metrics, rendering exploration and optimization distorted and inefficient. Prior manifold-based remedies assume fixed intrinsic dimensionality, which critically fails in practice for peptide data. Here, we introduce PepCompass, a geometry-aware framework for peptide exploration and optimization. At its core, we define a Union of $κ$-Stable Riemannian Manifolds $\mathbb{M}^κ$, a family of decoder-induced manifolds that captures local geometry while ensuring computational stability. We propose two local exploration methods: Second-Order Riemannian Brownian Efficient Sampling (SORBES), which provides a convergent second-order approximation to Riemannian Brownian motion, and Mutation Enumeration in Tangent Space (MUTANG), which reinterprets tangent directions as discrete amino-acid substitutions. Combining these yields Local Enumeration Bayesian Optimization (LE-BO), an efficient algorithm for local activity optimization. Finally, we introduce Potential-minimizing Geodesic Search (PoGS), which interpolates between prototype embeddings along property-enriched geodesics, biasing discovery toward seeds, i.e., peptides with favorable activity. In vitro validation confirms the effectiveness of PepCompass: PoGS yields four novel seeds, and subsequent optimization with LE-BO discovers 25 highly active peptides with broad-spectrum activity, including against resistant bacterial strains. These results demonstrate that geometry-informed exploration provides a powerful new paradigm for antimicrobial peptide design.
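To make the contrast between flat Euclidean metrics and decoder-induced geometry concrete, the following minimal sketch measures the same latent segment both ways. It assumes the standard pull-back construction $G(z) = J(z)^\top J(z)$, where $J$ is the decoder Jacobian; the abstract does not spell out the exact metric or the $κ$-stability criterion, so neither is claimed here. The decoder `toy_decoder` and all helper names are hypothetical stand-ins, not part of PepCompass.

```python
import numpy as np

def toy_decoder(z):
    """Hypothetical smooth decoder R^2 -> R^3 (a stand-in, not a peptide model)."""
    x, y = z
    return np.array([x, y, np.sin(3.0 * x) * np.cos(3.0 * y)])

def jacobian(f, z, eps=1e-6):
    """Central finite-difference Jacobian of f at z, shape (out_dim, latent_dim)."""
    z = np.asarray(z, dtype=float)
    cols = []
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        cols.append((f(z + step) - f(z - step)) / (2.0 * eps))
    return np.stack(cols, axis=1)

def pullback_metric(f, z):
    """Assumed decoder-induced metric G(z) = J(z)^T J(z)."""
    J = jacobian(f, z)
    return J.T @ J

def curve_length(f, z0, z1, n_steps=200):
    """Length of the straight latent segment z0 -> z1 under the pull-back
    metric, approximated with the midpoint rule."""
    z0 = np.asarray(z0, dtype=float)
    z1 = np.asarray(z1, dtype=float)
    dz = (z1 - z0) / n_steps
    total = 0.0
    for k in range(n_steps):
        mid = z0 + (k + 0.5) * dz
        G = pullback_metric(f, mid)
        total += np.sqrt(dz @ G @ dz)
    return total

z_a = np.array([0.0, 0.0])
z_b = np.array([1.0, 0.0])
euclidean = np.linalg.norm(z_b - z_a)                 # flat latent distance
riemannian = curve_length(toy_decoder, z_a, z_b)      # decoder-induced length
print(f"Euclidean: {euclidean:.3f}, pull-back: {riemannian:.3f}")
```

In this toy setting the pull-back length exceeds the Euclidean one because the decoder stretches the latent segment; a flat metric would underestimate how far apart the decoded outputs are along that path.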
