Full-Atom Peptide Design via Riemannian-Euclidean Bayesian Flow Networks

Hao Qian, Shikui Tu, Lei Xu

TL;DR

PepBFN introduces a first-of-its-kind Bayesian Flow Network for full-atom peptide design that operates in fully continuous parameter space. It combines a Gaussian mixture-based BFN for side-chain angles, a Matrix Fisher-based BFN for residue orientations on $SO(3)$, and Euclidean/categorical BFNs for centroids and residue types, all integrated by an SE(3)-aware neural network. The framework enables smooth Bayesian updates, addressing the discrete-continuous mismatch and multimodal rotamer distributions that limit prior methods. Across side-chain packing, reverse folding, and sequence-structure co-design benchmarks, PepBFN achieves state-of-the-art performance with faster convergence, better stability, and richer diversity, illustrating the practical potential for computational peptide design. The modular, principled approach promises broad applicability to docking, loop modeling, and scaffold generation, paving the way for more efficient and versatile peptide engineering.

Abstract

Diffusion and flow matching models have recently emerged as promising approaches for peptide binder design. Despite their progress, these models still face two major challenges. First, categorical sampling of discrete residue types collapses their continuous parameters into one-hot assignments, while continuous variables (e.g., atom positions) evolve smoothly throughout the generation process. This mismatch disrupts the update dynamics and results in suboptimal performance. Second, current models assume unimodal distributions for side-chain torsion angles, which conflicts with the inherently multimodal nature of side-chain rotameric states and limits prediction accuracy. To address these limitations, we introduce PepBFN, the first Bayesian flow network for full-atom peptide design that directly models parameter distributions in fully continuous space. Specifically, PepBFN models discrete residue types by learning their continuous parameter distributions, enabling joint and smooth Bayesian updates with other continuous structural parameters. It further employs a novel Gaussian-mixture-based Bayesian flow to capture the multimodal side-chain rotameric states and a Matrix-Fisher-based Riemannian flow to directly model residue orientations on the $\mathrm{SO}(3)$ manifold. Together, these parameter distributions are progressively refined via Bayesian updates, yielding smooth and coherent peptide generation. Experiments on side-chain packing, reverse folding, and binder design tasks demonstrate the strong potential of PepBFN in computational peptide design.

Paper Structure

This paper contains 48 sections, 5 theorems, 73 equations, 11 figures, 8 tables, 4 algorithms.

Key Result

Lemma 4.1

Let the prior $p(x)$ be a Gaussian mixture distribution and the likelihood $p(y \mid x)$ be a single Gaussian distribution. By Bayes’ rule, the posterior $p(x \mid y)$ retains the Gaussian mixture form.
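
The conjugacy stated in Lemma 4.1 can be illustrated with a minimal one-dimensional sketch. It is not the paper's implementation: it assumes a scalar variable on the real line (ignoring the periodicity of torsion angles), a likelihood of the form $p(y \mid x) = \mathcal{N}(y; x, \sigma_y^2)$, and hypothetical helper names (`normal_pdf`, `gmm_posterior`, `gmm_pdf`). Under those assumptions, each component is updated by the usual Gaussian precision-weighted rule, and each mixture weight is rescaled by how well its component explains the observation $y$.

```python
import numpy as np


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian N(x; mean, var)."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def gmm_posterior(weights, means, variances, y, noise_var):
    """Posterior over x for a Gaussian-mixture prior and Gaussian likelihood.

    Assumed model (illustrative, not taken from the paper):
        prior:       p(x)   = sum_k weights[k] * N(x; means[k], variances[k])
        likelihood:  p(y|x) = N(y; x, noise_var)
    Returns the weights, means and variances of the posterior mixture.
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Per-component Gaussian update: precisions add, means are precision-weighted.
    post_vars = 1.0 / (1.0 / variances + 1.0 / noise_var)
    post_means = post_vars * (means / variances + y / noise_var)

    # Each weight is rescaled by the component's marginal likelihood of y,
    # which is N(y; mean_k, var_k + noise_var), then renormalized.
    evidence = weights * normal_pdf(y, means, variances + noise_var)
    post_weights = evidence / evidence.sum()
    return post_weights, post_means, post_vars


def gmm_pdf(x, weights, means, variances):
    """Evaluate a Gaussian-mixture density at the points in x."""
    x = np.asarray(x, dtype=float)[..., None]
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    return (weights * normal_pdf(x, means, variances)).sum(axis=-1)


if __name__ == "__main__":
    # Two-component prior and a single noisy observation (arbitrary toy values).
    w, m, v = gmm_posterior([0.3, 0.7], [-1.0, 2.0], [0.5, 0.25], y=0.5, noise_var=0.4)
    print("posterior weights:", w)
    print("posterior means:", m)
    print("posterior variances:", v)
```

The posterior has the same number of components as the prior, so repeated observations keep the distribution inside the Gaussian-mixture family, which is the property the lemma asserts.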

Figures (11)

  • Figure 1: The overview of PepBFN.
  • Figure 2: Distribution of peptide torus angles.
  • Figure 3: Two examples of PepBFN-generated peptides with improved binding affinities. Top row: native peptides; bottom row: peptides generated by our method.
  • Figure 4: (a) Sequence stability of peptides during generation in peptide binder design task, measured by the fraction of sequences that no longer change. (b) Trajectories of Gaussian mixture component means for side-chain torsion angles.
  • Figure 5: $a(\lambda)$ vs. $\lambda$
  • ...and 6 more figures

Theorems & Definitions (10)

  • Lemma 4.1: Conjugacy of a Gaussian Mixture Prior with a Gaussian Likelihood
  • Proposition 4.2: Bayesian Flow for Gaussian Mixture Distribution
  • Proposition 4.3: Time-Dependent Linear Decrease of the Expected Entropy Upper Bound
  • Lemma 4.4: Conjugacy of Matrix Fisher Distributions
  • Proposition 4.5: Bayesian Flow for Matrix Fisher Distribution
  • proof
  • proof
  • proof
  • proof
  • proof