AReUReDi: Annealed Rectified Updates for Refining Discrete Flows with Multi-Objective Guidance

Tong Chen, Yinuo Zhang, Pranam Chatterjee

TL;DR

AReUReDi tackles discrete multi-objective biomolecule design by extending Rectified Discrete Flows with annealed Tchebycheff scalarization, locally balanced proposals, and Metropolis-Hastings updates to guarantee convergence to the Pareto front while preserving distributional invariance. The framework defines $S_{\omega}(x) = \min_{1\le n\le N} \omega_n \tilde{s}_n(x)$ and $W_{\eta_t,\omega}(x) = \exp(\eta_t S_{\omega}(x))$, where $\omega_n$ weights the $n$-th of $N$ objective scores $\tilde{s}_n(x)$ and $\eta_t$ is the guidance strength at step $t$. It employs an annealing schedule on $\eta_t$ and reversible, coordinate-wise proposals to navigate trade-offs across up to five objectives in peptide and SMILES design. Theoretical results include invariance of the sampling kernel and full-coverage convergence to Pareto-optimal states. Empirically, AReUReDi navigates the Pareto front better than traditional MOO methods and diffusion baselines while producing biologically plausible, diverse sequences. The approach delivers a principled, scalable tool for multi-property biomolecule generation, enabling simultaneous optimization of potency, stability, solubility, and safety across amino-acid sequences and chemically modified peptide backbones.
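The two definitions in the TL;DR can be made concrete with a short Python sketch. It assumes the scores $\tilde{s}_n(x)$ are already rescaled to $[0, 1]$, and the function names and the linear form of the $\eta_t$ schedule are illustrative assumptions, not taken from the paper, which only states that $\eta_t$ is annealed.

```python
import math


def scalarize(scores, weights):
    """S_omega(x) = min_n omega_n * s_n(x): the worst weighted objective."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return min(w * s for w, s in zip(weights, scores))


def guidance_weight(scores, weights, eta):
    """W_{eta,omega}(x) = exp(eta * S_omega(x))."""
    return math.exp(eta * scalarize(scores, weights))


def linear_eta(t, num_steps, eta_min=1.0, eta_max=20.0):
    """Hypothetical annealing schedule: eta grows linearly with step t.

    The bounds and the linear shape are assumptions for illustration only.
    """
    if num_steps <= 1:
        return eta_max
    return eta_min + (eta_max - eta_min) * t / (num_steps - 1)


# Two objectives with equal weights: the weaker one (0.5) determines S.
s = scalarize([0.8, 0.5], [0.5, 0.5])
w_early = guidance_weight([0.8, 0.5], [0.5, 0.5], linear_eta(0, 10))
w_late = guidance_weight([0.8, 0.5], [0.5, 0.5], linear_eta(9, 10))
```

Because $S_{\omega}$ takes a minimum, raising an already strong objective does not change the weight; only improving the worst weighted objective does. Increasing $\eta_t$ sharpens the preference for states with a high $S_{\omega}$.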

Abstract

Designing sequences that satisfy multiple, often conflicting, objectives is a central challenge in therapeutic and biomolecular engineering. Existing generative frameworks largely operate in continuous spaces with single-objective guidance, while discrete approaches lack guarantees for multi-objective Pareto optimality. We introduce AReUReDi (Annealed Rectified Updates for Refining Discrete Flows), a discrete optimization algorithm with theoretical guarantees of convergence to the Pareto front. Building on Rectified Discrete Flows (ReDi), AReUReDi combines Tchebycheff scalarization, locally balanced proposals, and annealed Metropolis-Hastings updates to bias sampling toward Pareto-optimal states while preserving distributional invariance. Applied to peptide and SMILES sequence design, AReUReDi simultaneously optimizes up to five therapeutic properties (including affinity, solubility, hemolysis, half-life, and non-fouling) and outperforms both evolutionary and diffusion-based baselines. These results establish AReUReDi as a powerful, sequence-based framework for multi-property biomolecule generation.

Paper Structure

This paper contains 31 sections, 21 equations, 7 figures, 10 tables, 1 algorithm.

Figures (7)

  • Figure 1: AReUReDi. Discrete flow matching is first rectified to reduce conditional total correlation. At each timestep, candidate single-position mutations with ReDi-predicted probabilities (visualized by arrows of varying thickness) are evaluated by multiple objective functions. A locally balanced proposal is then constructed using Tchebycheff scalarization with annealed guidance strength, and the next state is selected via a Metropolis-Hastings update. This iterative process drives the generated sequences toward the Pareto front.
  • Figure 2: (A), (B) Complex structures of PDB 1B8Q with an AReUReDi-designed binder and its pre-existing binder. (C), (D) Complex structures of OX1R and EWS::FLI1 with an AReUReDi-designed binder. Five property scores are shown for each binder, along with the ipTM score from AlphaFold3 and docking score from AutoDock VINA. Interacting residues on the target are visualized. (E) Plots showing the mean scores for each property across the number of iterations during AReUReDi's design of binders of length 12-aa for EWS::FLI1. (F) A density plot illustrating the distribution of predicted property scores for AReUReDi-designed EWS::FLI1 binders of length 12-aa, compared to the peptides generated unconditionally by PepReDi$^3$.
  • Figure 3: (A) Example 2D SMILES structure of AReUReDi-designed peptide binders with four property scores. (B) Plots showing the mean scores for each property across the number of iterations during AReUReDi's design of binders of length 200 for NCAM1.
  • Figure S1: Complex structures of target proteins with pre-existing binders. (A)-(B) 5AZ8, (C)-(D) 7JVS. Each panel shows the complex structure of the target with either an AReUReDi-designed binder or its pre-existing binder. For each binder, five property scores are provided, as well as the ipTM score from AlphaFold3 and the docking score from AutoDock VINA. Interacting residues on the target are visualized.
  • Figure S2: Complex structures of target proteins without pre-existing binders. (A)-(C) AMHR2, (D)-(E) EWS::FLI1, (F) MYC, (G) DUSP12. Each panel shows the complex structure of the target with an AReUReDi-designed binder. For each binder, five property scores are provided, as well as the ipTM score from AlphaFold3 and the docking score from AutoDock VINA. Interacting residues on the target are visualized.
  • ...and 2 more figures
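
The iteration described in the Figure 1 caption (single-position candidates, a locally balanced proposal, then a Metropolis-Hastings update) can be sketched on a toy problem. This is a minimal illustration under loudly stated assumptions, not the authors' implementation: the ReDi-predicted mutation probabilities are replaced by a uniform stand-in, the two objectives are hypothetical letter counts, the balancing function $g(r) = \sqrt{r}$ is one common choice, and the linear $\eta_t$ schedule is made up for the example.

```python
import math
import random

VOCAB = "ACDE"  # toy alphabet (assumption)


def toy_scores(seq):
    """Two hypothetical, conflicting objectives in [0, 1]."""
    return [seq.count("A") / len(seq), seq.count("D") / len(seq)]


def log_weight(seq, omega, eta):
    """log W_{eta,omega}(x) = eta * min_n omega_n * s_n(x)."""
    return eta * min(w * s for w, s in zip(omega, toy_scores(seq)))


def neighbors(seq, pos):
    """All sequences that differ from seq only at position pos."""
    return [seq[:pos] + a + seq[pos + 1:] for a in VOCAB if a != seq[pos]]


def proposal(seq, pos, omega, eta):
    """Locally balanced proposal q(y|x) proportional to sqrt(W(y) / W(x)).

    A uniform base kernel stands in for the ReDi-predicted probabilities.
    """
    cands = neighbors(seq, pos)
    lw_x = log_weight(seq, omega, eta)
    raw = [math.exp(0.5 * (log_weight(y, omega, eta) - lw_x)) for y in cands]
    total = sum(raw)
    return cands, [r / total for r in raw]


def mh_step(seq, omega, eta, rng):
    """One Metropolis-Hastings update targeting pi(x) proportional to W(x)."""
    pos = rng.randrange(len(seq))  # uniform position, cancels in the ratio
    cands, probs = proposal(seq, pos, omega, eta)
    idx = rng.choices(range(len(cands)), weights=probs)[0]
    y = cands[idx]
    # Probability of proposing the reverse move y -> seq at the same position.
    back_cands, back_probs = proposal(y, pos, omega, eta)
    q_back = back_probs[back_cands.index(seq)]
    log_alpha = (log_weight(y, omega, eta) - log_weight(seq, omega, eta)
                 + math.log(q_back) - math.log(probs[idx]))
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return y
    return seq


def run(length=8, steps=200, seed=0):
    rng = random.Random(seed)
    omega = [0.5, 0.5]
    seq = "".join(rng.choice(VOCAB) for _ in range(length))
    for t in range(steps):
        eta = 1.0 + 29.0 * t / (steps - 1)  # hypothetical linear annealing
        seq = mh_step(seq, omega, eta, rng)
    return seq
```

Calling `run()` returns a sequence over the toy alphabet. For a fixed $\eta$, the acceptance ratio makes each step reversible with respect to the distribution proportional to $W_{\eta,\omega}$, which is the sense in which such a kernel leaves its target invariant. With the annealed schedule the chain is time-inhomogeneous, and this sketch makes no claim about reproducing the paper's convergence guarantees.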

Theorems & Definitions (4)

  • proof
  • proof
  • proof
  • proof