ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation

Angxiao Yue, Zichong Wang, Hongteng Xu

TL;DR

ReQFlow introduces a rectified quaternion flow matching approach for protein backbone generation that decouples residue translations and rotations, representing rotations with unit quaternions and performing SLERP-based interpolation in exponential format. By rectifying the learned QFlow, ReQFlow achieves non-crossing sampling paths, preserves marginal distributions, and reduces inference steps, leading to substantial speedups without sacrificing designability. Empirical results on PDB and SCOPe show state-of-the-art designability while delivering up to ~37× speedups over RFDiffusion and ~63× over Genie2, with strong performance on long-chain proteins and robust generalization. The work emphasizes numerical stability benefits of quaternion-based rotation interpolation and demonstrates the broader applicability of flow rectification in SO(3) across protein-design tasks. These advances offer practical impact for large-scale de novo protein design where both quality and efficiency are critical.

Abstract

Protein backbone generation plays a central role in de novo protein design and is significant for many biological and medical applications. Although diffusion and flow-based generative models provide potential solutions to this challenging task, they often generate proteins with poor designability and suffer from computational inefficiency. In this study, we propose a novel rectified quaternion flow (ReQFlow) matching method for fast and high-quality protein backbone generation. In particular, our method generates a local translation and a 3D rotation from random noise for each residue in a protein chain. It represents each 3D rotation as a unit quaternion and constructs its flow by spherical linear interpolation (SLERP) in an exponential format. We train the model by quaternion flow (QFlow) matching with guaranteed numerical stability, and then rectify the QFlow model to accelerate its inference and improve the designability of generated protein backbones, leading to the proposed ReQFlow model. Experiments show that ReQFlow achieves on-par performance in protein backbone generation while requiring far fewer sampling steps and significantly less inference time (e.g., being 37x faster than RFDiffusion and 63x faster than Genie2 when generating a backbone of length 300), demonstrating its effectiveness and efficiency. The code is available at https://github.com/AngxiaoYue/ReQFlow.
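
To make "SLERP in an exponential format" concrete, the following is a minimal NumPy sketch of one common convention, $\bm{q}_t = \bm{q}_0 \otimes \exp\big(t \log(\bm{q}_0^{-1} \otimes \bm{q}_1)\big)$. It is not taken from the ReQFlow repository: the function names (`quat_mul`, `quat_log`, `quat_exp`, `slerp_exp`), the `(w, x, y, z)` storage order, and the multiplication order are all assumptions made for illustration, and the paper's own definition may differ in these details.

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ])

def quat_conj(q):
    """Conjugate, which equals the inverse for a unit quaternion."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def quat_log(q):
    """Log map of a unit quaternion: the 3-vector (half-angle) * axis."""
    v = q[1:]
    n = np.linalg.norm(v)
    if n < 1e-12:  # (near-)identity rotation: the axis is undefined
        return np.zeros(3)
    return np.arctan2(n, q[0]) * v / n

def quat_exp(u):
    """Exp map: 3-vector u -> unit quaternion (cos|u|, sin|u| * u/|u|)."""
    n = np.linalg.norm(u)
    if n < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(n)], np.sin(n) * u / n))

def slerp_exp(q0, q1, t):
    """SLERP written as q0 * exp(t * log(q0^{-1} * q1))."""
    rel = quat_mul(quat_conj(q0), q1)
    if rel[0] < 0.0:  # q and -q encode the same rotation; take the short arc
        rel = -rel
    return quat_mul(q0, quat_exp(t * quat_log(rel)))

# Usage: interpolate from a 90-degree turn about x to a 90-degree turn about z.
c = np.sqrt(0.5)
q_start = np.array([c, c, 0.0, 0.0])
q_end = np.array([c, 0.0, 0.0, c])
q_mid = slerp_exp(q_start, q_end, 0.5)
```

Because the interpolant is a product of unit quaternions, `q_mid` stays on the unit sphere without any re-normalization, and the half-angle is obtained from `arctan2` rather than from an inverse cosine.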

Paper Structure

This paper contains 34 sections, 4 theorems, 53 equations, 9 figures, 10 tables, 3 algorithms.

Key Result

Theorem 3.1

(Marginal preserving property). The pair $(\bm{q}_0^{\prime}, \bm{q}_1^{\prime})$ is a coupling of $\mathcal{Q}_0$ and $\mathcal{Q}_1$. The marginal law of $\bm{q}_t^{\prime}$ equals that of $\bm{q}_t$ at every time $t$, that is, $\text{Law}(\bm{q}_t^{\prime}) = \text{Law}(\bm{q}_t)$.

Figures (9)

  • Figure 1: (a) An illustration of our rectified quaternion flow matching method, in which each residue is represented as a frame associated with a local transformation. (b) For each method, the size of its circle indicates the model size, and the location of the circle's centroid indicates the logarithm of the average inference time when generating a protein backbone with length $N=300$ and the Fraction score of designable protein backbones. For QFlow and ReQFlow, we set the sampling step $L \in\{20, 50, 500\}$, respectively.
  • Figure 2: (a) Mean round-trip errors from $\pi - 10^{-1}$ to $\pi - 10^{-7}$. (b) The frequency of suffering large rotation angles per protein when training on the two datasets. (c) The average number of small rotation angles per protein when generating ten backbones for each length.
  • Figure 3: The distribution of protein backbones with respect to the percentages of their secondary structure.
  • Figure 4: The comparison for various methods on the designability of generated long-chain protein backbones.
  • Figure 5: A comparison for various methods on their designability with the reduction of sampling steps. Original data is in Table \ref{tab:SCOPe table}.
  • ...and 4 more figures
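
The round-trip errors near $\pi$ mentioned in the Figure 2 caption can be illustrated with a small experiment. The sketch below is an assumption-laden stand-in, not the paper's benchmark: it converts an axis-angle rotation to a rotation matrix and back using the textbook trace-based logarithm, does the same through a unit quaternion, and compares the recovery errors for angles $\pi - 10^{-1}, \dots, \pi - 10^{-7}$. The helper names and the test axis are hypothetical.

```python
import numpy as np

def axis_angle_to_matrix(axis, theta):
    """Rodrigues' formula for a unit axis and an angle theta."""
    x, y, z = axis
    K = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def matrix_to_axis_angle(R):
    """Textbook matrix log: angle from the trace, axis from the skew part."""
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return w / (2.0 * np.sin(theta)), theta  # divides by sin(theta) -> 0 near pi

def axis_angle_to_quat(axis, theta):
    """Unit quaternion (w, x, y, z) for a unit axis and an angle theta."""
    return np.concatenate(([np.cos(theta / 2.0)], np.sin(theta / 2.0) * axis))

def quat_to_axis_angle(q):
    """Quaternion log: the vector part has norm close to 1 near pi."""
    n = np.linalg.norm(q[1:])
    return q[1:] / n, 2.0 * np.arctan2(n, q[0])

def round_trip_errors(theta, axis):
    """Error of the recovered rotation vector for both representations."""
    a_m, t_m = matrix_to_axis_angle(axis_angle_to_matrix(axis, theta))
    a_q, t_q = quat_to_axis_angle(axis_angle_to_quat(axis, theta))
    err_matrix = np.linalg.norm(t_m * a_m - theta * axis)
    err_quat = np.linalg.norm(t_q * a_q - theta * axis)
    return err_matrix, err_quat

axis = np.array([1.0, 2.0, 2.0]) / 3.0  # arbitrary unit axis for the demo
errors = []
for k in range(1, 8):
    theta = np.pi - 10.0 ** (-k)
    errors.append(round_trip_errors(theta, axis))
    print(f"pi - 1e-{k}: matrix {errors[-1][0]:.2e}, quaternion {errors[-1][1]:.2e}")
```

In double precision the matrix route typically loses accuracy as the angle approaches $\pi$, because both the inverse cosine of the trace and the division by $\sin\theta$ become ill-conditioned there, while the quaternion route stays near machine precision. Exact numbers depend on the axis and the floating-point environment.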

Theorems & Definitions (8)

  • Theorem 3.1
  • Theorem 3.2
  • Corollary 3.3
  • Proposition 1.1
  • proof
  • proof
  • proof
  • proof