Improving Rectified Flow with Boundary Conditions

Xixi Hu, Runlong Liao, Keyang Xu, Bo Liu, Yeqing Li, Eugene Ie, Hongliang Fei, Qiang Liu

TL;DR

Rectified Flow (RF) learns a velocity field that transports noise to data via an ODE, but the vanilla RF model often violates the theoretical boundary conditions, which destabilizes sampling near the terminal time. The authors introduce Boundary-enforced Rectified Flow Models with two parameterizations, Mask-based and Subtraction-based, that enforce $v(\mathbf{x},1)=\mathbf{x}$ (and optionally $v(\mathbf{x},0)=C-\mathbf{x}$) by design, with minimal code changes. They report gains on ImageNet and CIFAR-10 under both deterministic (Euler) and stochastic (SDE) sampling; ablations confirm the importance of the boundary choices and show that the method scales to larger models and higher resolutions. By stabilizing the score function near $t=1$, the approach enables robust stochastic sampling and suggests applicability to broader diffusion-flow hybrids. Overall, the Boundary RF Model is a simple, effective, and scalable fix for boundary violations in Rectified Flow.

Abstract

Rectified Flow offers a simple and effective approach to high-quality generative modeling by learning a velocity field. However, we identify a limitation in directly modeling the velocity with an unconstrained neural network: the learned velocity often fails to satisfy certain boundary conditions, leading to inaccurate velocity field estimates that deviate from the desired ODE. This issue is particularly critical during stochastic sampling at inference, as the score function's errors are amplified near the boundary. To mitigate this, we propose a Boundary-enforced Rectified Flow Model (Boundary RF Model), in which we enforce boundary conditions with a minimal code modification. The Boundary RF Model improves performance over the vanilla RF model, demonstrating an 8.01% improvement in FID score on ImageNet using ODE sampling and an 8.98% improvement using SDE sampling.
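
To make the idea of enforcing a boundary condition "by design" concrete, here is a minimal sketch in Python. It is not the authors' exact parameterization: the blend weights `(1 - t)` and `t`, the function names, and the toy `raw_net` stand-in for a neural network are all assumptions chosen for illustration. The only property it is meant to show is that a time-dependent mask on the network output can guarantee $v(\mathbf{x},1)=\mathbf{x}$ regardless of what the network predicts.

```python
import numpy as np


def raw_net(x, t):
    """Hypothetical stand-in for an unconstrained network output.

    Any function of (x, t) works here; this one is deliberately wrong at t = 1.
    """
    return np.tanh(x) + 0.3 * np.sin(3.0 * x) + t


def boundary_velocity(x, t, net=raw_net):
    """Mask-style blend (illustrative assumption, not the paper's formula).

    The network term is weighted by (1 - t) and the known boundary value x
    by t, so at t = 1 the network contribution vanishes and v(x, 1) = x.
    """
    return (1.0 - t) * net(x, t) + t * x


x = np.linspace(-2.0, 2.0, 5)

# The unconstrained output violates the right boundary condition ...
unconstrained_gap = np.max(np.abs(raw_net(x, 1.0) - x))
# ... while the masked parameterization satisfies it by construction.
enforced_gap = np.max(np.abs(boundary_velocity(x, 1.0) - x))

print("unconstrained gap at t=1:", unconstrained_gap)
print("enforced gap at t=1:", enforced_gap)
```

Because the constraint lives in the parameterization rather than in the loss, it holds at every training step and for any network weights, which is consistent with the "minimal code modification" described in the abstract.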

Paper Structure

This paper contains 36 sections, 13 equations, 10 figures, 5 tables, 2 algorithms.

Figures (10)

  • Figure 1: Boundary condition violation in Rectified Flow: predicted vs. expected velocity. The vanilla Rectified Flow model learns a velocity field $v(\mathbf{x}, t)$ to transform noise (left, $t=0$) into data (right, $t=1$). Ideally, this learned velocity field should satisfy the defined boundary condition (top). However, as visualized, the predicted velocity at $t=1$ (bottom) deviates from the expected data distribution and violates the right boundary condition $v(\mathbf{x}, 1) = \mathbf{x}$. This highlights a critical practical limitation of the vanilla RF model. Please refer to Appendix \ref{sec::flux} for more examples.
  • Figure 2: Toy example: Boundary RF Model stabilizes stochastic sampling and the score function. We visualize the behavior of the vanilla RF model and the Boundary RF Model when learning to map from noise $\pi_0$ to data $\pi_1$. From left to right, we visualize the following: 1) Euler Sampling (Deterministic): Both models learn effective ODE trajectories, generating similar samples via the deterministic Euler sampler. 2) Stochastic Sampling: The vanilla RF model produces more concentrated samples due to score function instability. In contrast, the Boundary RF Model, by enforcing boundary conditions, generates samples that retain the shape of the target distribution. 3) Velocity at the $t=1$ Boundary: The vanilla RF model's velocities deviate from the data, violating $v(\mathbf{x}, 1) = \mathbf{x}$, while the Boundary RF Model adheres to this boundary condition. 4) Score Function: We visualize the score function computed using Eq. \ref{equ:tweedie}. Near $t=1$, the vanilla RF model exhibits an unstable and divergent score field. In contrast, the Boundary RF Model demonstrates a stable and well-behaved score function, preventing the unboundedness that leads to the concentration effect in stochastic sampling.
  • Figure 3: Performance comparison of the Boundary RF Model and the vanilla RF model vs. the number of sampling steps (50, 100, and 200 steps) on the ImageNet dataset. The Boundary RF Model consistently outperforms the vanilla RF model across varying numbers of sampling steps, exhibiting a more substantial performance gain at higher step counts.
  • Figure 4: Qualitative comparison of image generation results on the ImageNet $256 \times 256$ dataset. We present paired examples generated by the vanilla RF model, the Mask-based Boundary RF Model, and the Subtraction-based Boundary RF Model. We use the same random seed during training and evaluation for all models to ensure a fair visual comparison. Our approaches consistently generate images with better structures and improved visual fidelity compared to the vanilla RF model. Additional visual examples are provided in Appendix \ref{sec::additional-examples}.
  • Figure 5: Ablation study on boundary functions. Quantitative comparison of different choices for the functions $f(t)$, $g(t)$, and $h(t)$ in the double-boundary model. Metrics are evaluated on the ImageNet $256\times256$ dataset.
  • ...and 5 more figures
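
The score instability described in the Figure 2 caption can be illustrated numerically. The sketch below assumes the standard linear interpolation $\mathbf{x}_t = t\,\mathbf{x}_1 + (1-t)\,\mathbf{x}_0$ with $\mathbf{x}_0 \sim \mathcal{N}(0, I)$, under which the score implied by a velocity is $(t\,v(\mathbf{x},t) - \mathbf{x})/(1-t)$. That formula is derived here from these assumptions, and the page does not show whether it matches the paper's referenced equation. The function name and the constant error `eps` are hypothetical.

```python
import numpy as np


def score_from_velocity(x, t, v):
    """Score implied by velocity v at (x, t).

    Assumes x_t = t * x_1 + (1 - t) * x_0 with x_0 ~ N(0, I), which gives
    score = (t * v - x) / (1 - t).
    """
    return (t * v - x) / (1.0 - t)


x = np.array([0.5, -1.0, 2.0])
eps = 0.05  # hypothetical constant velocity error near the boundary

amplification = {}
for t in (0.9, 0.99, 0.999):
    # A velocity equal to x keeps the numerator proportional to (1 - t),
    # so the score stays bounded (it equals -x here).
    clean = score_from_velocity(x, t, x)
    # The same velocity with a small error: the error is scaled by t / (1 - t).
    noisy = score_from_velocity(x, t, x + eps)
    amplification[t] = np.max(np.abs(noisy - clean))
    print(f"t={t}: score error = {amplification[t]:.3f}")
```

In this toy setting a fixed velocity error is multiplied by $t/(1-t)$, so it grows without bound as $t \to 1$, whereas a velocity that meets $v(\mathbf{x},1)=\mathbf{x}$ keeps the numerator vanishing at the same rate as the denominator.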