Sequential Flow Straightening for Generative Modeling

Jongmin Yoon, Juho Lee

TL;DR

SeqRF addresses slow sampling in continuous-time generative models by time-segmenting the ODE trajectory and training on joint distributions to straighten the probability flow, reducing global truncation error and enabling faster sampling with improved synthesis. The method combines theoretical bounds on ODE truncation error with a practical training objective over segmented time intervals, and distillation further accelerates inference. Empirical results on CIFAR-10, CelebA-64, and LSUN-Church demonstrate state-of-the-art or competitive FID/KID scores with few function evaluations, outperforming prior flow-matching and diffusion-based approaches. The work advances efficient, high-quality generative modeling and hints at broad applicability beyond image generation, albeit with ethical considerations around synthetic media.

Abstract

Straightening the probability flow of continuous-time generative models, such as diffusion models or flow-based models, is the key to fast sampling through numerical solvers. Existing methods learn a linear path by directly generating the probability path from the joint distribution between the noise and data distributions. One key reason for the slow sampling speed of the ODE-based solvers that simulate these generative models is the global truncation error of the ODE solver, caused by the high curvature of the ODE trajectory, which makes the truncation error of numerical solvers explode in the low-NFE regime. To address this challenge, we propose a novel method called SeqRF, a learning technique that straightens the probability flow to reduce the global truncation error and hence accelerate sampling and improve synthesis quality. In both theoretical and empirical studies, we first observe the straightening property of SeqRF. Through empirical evaluations of SeqRF over flow-based generative models, we achieve surpassing results on the CIFAR-10, CelebA-$64 \times 64$, and LSUN-Church datasets.
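The link between trajectory curvature and truncation error in the low-NFE regime can be illustrated with a toy example. The sketch below is not the SeqRF method and not the paper's code: it only compares explicit Euler integration on a straight path (constant velocity) against a curved path (a quarter-circle rotation), both running from the same start to the same end point. All names (`euler`, `straight_velocity`, `curved_velocity`, `errors`) are hypothetical and chosen for illustration.

```python
import math

def euler(velocity, x0, n_steps, t0=0.0, t1=1.0):
    """Integrate dx/dt = velocity(t, x) from t0 to t1 with n_steps explicit Euler steps."""
    h = (t1 - t0) / n_steps
    t, x = t0, list(x0)
    for _ in range(n_steps):
        v = velocity(t, x)
        x = [xi + h * vi for xi, vi in zip(x, v)]
        t += h
    return x

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Toy endpoints (assumed for illustration): both flows move START to END over t in [0, 1].
START, END = (1.0, 0.0), (0.0, 1.0)

def straight_velocity(t, x):
    # Straight flow: constant velocity END - START, independent of t and x.
    return [e - s for s, e in zip(START, END)]

OMEGA = math.pi / 2  # a quarter turn over unit time

def curved_velocity(t, x):
    # Curved flow: rotation about the origin; the exact path is a quarter circle.
    return [-OMEGA * x[1], OMEGA * x[0]]

# Global error at t = 1 for several step counts (one function evaluation per step).
errors = {}
for n in (1, 2, 4, 8):
    err_straight = dist(euler(straight_velocity, START, n), END)
    err_curved = dist(euler(curved_velocity, START, n), END)
    errors[n] = (err_straight, err_curved)
    print(f"steps={n}: straight error={err_straight:.2e}, curved error={err_curved:.2e}")
```

On the straight path a single Euler step already lands on the end point, while on the curved path the error at the final time is large for one step and shrinks only as the number of steps grows.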

Paper Structure (51 sections, 6 theorems, 46 equations, 13 figures, 4 tables)

This paper contains 51 sections, 6 theorems, 46 equations, 13 figures, 4 tables.

Key Result

Proposition 1

The gradients of the FM objective (eq:fm_objective) and the CFM objective (eq:cfm_objective) with respect to $\theta$ are equal. That is, ${\mathcal{L}}_\mathrm{FM}(\theta)={\mathcal{L}}_\mathrm{CFM}(\theta) + C$ for a constant $C$ that does not depend on $\theta$. The detailed proof is given in app:fm_cfm.
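For orientation, the two objectives can be written out in the usual flow-matching notation. This is a sketch under assumptions, since the equations themselves are not reproduced on this page: $v_\theta$ denotes the learned vector field, $u_t(x)$ the marginal target field, $u_t(x \mid x_1)$ the conditional target field, $q$ the data distribution, and $p_t$ the probability path. The paper's own conditioning variables may differ.

$$
\mathcal{L}_\mathrm{FM}(\theta) = \mathbb{E}_{t,\, x \sim p_t}\left\| v_\theta(t, x) - u_t(x) \right\|^2,
\qquad
\mathcal{L}_\mathrm{CFM}(\theta) = \mathbb{E}_{t,\, x_1 \sim q,\, x \sim p_t(\cdot \mid x_1)}\left\| v_\theta(t, x) - u_t(x \mid x_1) \right\|^2 .
$$

Expanding both squared norms, the $\|v_\theta\|^2$ terms agree because $p_t(x) = \int p_t(x \mid x_1)\, q(x_1)\, dx_1$, and the cross terms agree because $u_t(x)\, p_t(x) = \int u_t(x \mid x_1)\, p_t(x \mid x_1)\, q(x_1)\, dx_1$. The only remaining difference is between the terms $\|u_t(x)\|^2$ and $\|u_t(x \mid x_1)\|^2$, which do not involve $\theta$, so the two losses differ by a constant and share the same gradient.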

Figures (13)

  • Figure 1: Overall generation results on the CIFAR-10 dataset, comparing our method to existing diffusion and flow-based model solvers. The black starred points denote our proposed SeqRF method.
  • Figure 2: Conceptual illustration of our method. The red, yellow, and blue triangles represent the truncation error accumulated over the corresponding time interval. Compared to the red reflow method, sequential reflow (SeqRF) mitigates marginal truncation error by running a time-segmented ODE.
  • Figure 3: The global truncation error on the CIFAR-10 dataset, measured against the oracle Euler-480 step solver. Our SeqRF methods, shown as blue and red lines, exhibit lower global truncation errors.
  • Figure 4: Conceptual illustration of $k$-SeqRF-Distill, contrasted with the RF-Distill method liu2023rf.
  • Figure 5: Generation performance of sequential reflow compared to the original rectified flow method. The black, red, and remaining lines represent the rectified flow (1-RF), the reflowed model (2-RF), and the $k$-SeqRF ($k=\{1, 2, 4, 6, 8, 12\}$) models, respectively. The starred points denote the performance of distilled models.
  • ...and 8 more figures

Theorems & Definitions (14)

  • Proposition 1: Equivalence of the FM and CFM objective
  • Definition 1: truncation errors
  • Theorem 2: Upper Bound of GTE w.r.t. LTE
  • proof
  • Lemma 3: Dahlquist Equivalence Theorem dahlquist1963
  • Theorem 4: Global Truncation Error with Increasing Time Segments
  • proof
  • Proposition 5: pooladian2023multisample, Lemma 3.2
  • Proposition 6: Equivalence of eq:fm_objective and eq:cfm_objective
  • proof
  • ...and 4 more