FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing

Yingying Deng, Xiangyu He, Changwang Mei, Peisong Wang, Fan Tang

TL;DR

FireFlow tackles fast, accurate inversion and editing for Rectified Flow (ReFlow) models by introducing a low-cost ODE solver that reuses velocity estimates to reach second-order accuracy at roughly the cost of a first-order Euler step. This training-free approach leverages the near-constant velocity dynamics of well-trained ReFlow models to enable 8-step inversion and editing with improved reconstruction fidelity and lower runtime. Empirical results show speedups of up to ~2.7× and substantial error reductions in inversion and reconstruction, as well as competitive or superior results in text-guided generation and semantic editing (e.g., on PIE-Bench), without needing auxiliary editing models. The method balances accuracy and efficiency, offering scalable, real-time-capable inversion for ReFlow-based generators such as FLUX and for broader editing tasks; the authors note limitations in color edits and propose enhancements via attention feature integration. The velocity dynamics $v_\theta$, the step size $\Delta t$, and the modified midpoint updates underpin the core contributions and practical impact of FireFlow.
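To make the idea of a velocity-reusing midpoint update concrete, here is a minimal sketch on a toy scalar ODE. It is an illustration under stated assumptions, not the authors' implementation: the field `velocity`, the helper `exact`, and the function `reuse_midpoint` are hypothetical names, and the update rule (use the previous step's midpoint velocity in place of a fresh evaluation at the current point) is one reading of "reuses velocity estimates" in the summary above.

```python
import math

def velocity(x, t):
    """Toy stand-in for v_theta (hypothetical): smooth and slowly varying."""
    return 1.0 - 0.5 * x

def exact(x0, t):
    """Closed-form solution of dx/dt = 1 - 0.5*x, used only to measure error."""
    return 2.0 + (x0 - 2.0) * math.exp(-0.5 * t)

def euler(v, x, t0, t1, n):
    """First-order Euler: one velocity evaluation (NFE) per step."""
    dt = (t1 - t0) / n
    nfe = 0
    for i in range(n):
        x = x + dt * v(x, t0 + i * dt)
        nfe += 1
    return x, nfe

def midpoint(v, x, t0, t1, n):
    """Standard midpoint rule: two velocity evaluations per step."""
    dt = (t1 - t0) / n
    nfe = 0
    for i in range(n):
        t = t0 + i * dt
        x_mid = x + 0.5 * dt * v(x, t)
        x = x + dt * v(x_mid, t + 0.5 * dt)
        nfe += 2
    return x, nfe

def reuse_midpoint(v, x, t0, t1, n):
    """Midpoint-style update that reuses the previous midpoint velocity.

    Assumption: the cached velocity v_hat replaces the fresh evaluation
    v(x, t) when predicting the midpoint, so each step costs one NFE
    (plus one evaluation to initialise the cache).
    """
    dt = (t1 - t0) / n
    v_hat = v(x, t0)  # initialise the cache
    nfe = 1
    for i in range(n):
        t = t0 + i * dt
        x_mid = x + 0.5 * dt * v_hat       # predict midpoint with cached velocity
        v_mid = v(x_mid, t + 0.5 * dt)     # the only new evaluation in this step
        nfe += 1
        x = x + dt * v_mid                 # full step with the midpoint velocity
        v_hat = v_mid                      # cache for the next step
    return x, nfe

if __name__ == "__main__":
    ref = exact(0.0, 1.0)
    for name, solver in [("euler", euler), ("midpoint", midpoint),
                         ("reuse_midpoint", reuse_midpoint)]:
        x, nfe = solver(velocity, 0.0, 0.0, 1.0, 8)
        print(f"{name:15s} NFE={nfe:2d} abs error={abs(x - ref):.2e}")
```

On this toy problem with 8 steps, the reuse variant uses 9 evaluations against 16 for the standard midpoint rule, while its error stays well below that of Euler. How closely this carries over to a learned $v_\theta$ depends on how slowly the velocity actually varies along the trajectory.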

Abstract

Though Rectified Flows (ReFlows) with distillation offer a promising way for fast sampling, their fast inversion, which transforms images back to structured noise for recovery and subsequent editing, remains unsolved. This paper introduces FireFlow, a simple yet effective zero-shot approach that inherits the startling capacity of ReFlow-based models (such as FLUX) in generation while extending their capabilities to accurate inversion and editing in $8$ steps. We first demonstrate that a carefully designed numerical solver is pivotal for ReFlow inversion, enabling accurate inversion and reconstruction with the precision of a second-order solver while maintaining the practical efficiency of a first-order Euler method. This solver achieves a $3\times$ runtime speedup compared to state-of-the-art ReFlow inversion and editing techniques, while delivering smaller reconstruction errors and superior editing results in a training-free mode. The code is available at https://github.com/HolmesShuan/FireFlow.
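The inversion-then-reconstruction round trip described in the abstract can be sketched on a toy problem: integrate the ODE in one direction, integrate back with the same solver and step count, and measure how far the result lands from the starting point. Everything below is an assumption made for illustration (the scalar field `velocity`, the helper names, and the choice of time direction); it is not the FireFlow code and says nothing about image-scale results.

```python
def velocity(x, t):
    """Toy stand-in for v_theta (hypothetical), not a trained model."""
    return 1.0 - 0.5 * x

def integrate(v, x, t_start, t_end, n, reuse):
    """Integrate dx/dt = v(x, t) from t_start to t_end in n steps.

    reuse=False: plain first-order Euler.
    reuse=True:  midpoint-style update that reuses the previous step's
                 midpoint velocity (an assumed reading of the solver idea).
    """
    dt = (t_end - t_start) / n  # negative when integrating backward in time
    v_hat = v(x, t_start) if reuse else None
    for i in range(n):
        t = t_start + i * dt
        if reuse:
            # One new evaluation per step, taken at the predicted midpoint.
            v_hat = v(x + 0.5 * dt * v_hat, t + 0.5 * dt)
            x = x + dt * v_hat
        else:
            x = x + dt * v(x, t)
    return x

def round_trip_error(x0, n, reuse):
    """Go from t=0 to t=1 and back, then report the reconstruction error."""
    x1 = integrate(velocity, x0, 0.0, 1.0, n, reuse)
    x0_rec = integrate(velocity, x1, 1.0, 0.0, n, reuse)
    return abs(x0_rec - x0)

if __name__ == "__main__":
    for label, reuse in [("euler", False), ("velocity reuse", True)]:
        print(f"{label:15s} round-trip error = {round_trip_error(0.0, 8, reuse):.2e}")
```

With 8 steps in each direction, the velocity-reuse variant returns much closer to the starting point than Euler on this toy field, which mirrors the qualitative claim about smaller reconstruction errors at comparable cost.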

Paper Structure

This paper contains 22 sections, 3 theorems, 51 equations, 9 figures, 6 tables, 2 algorithms.

Key Result

Proposition 3.1

Given a $p$-th order ODE solver and the ODE $\frac{\mathrm{d}X_t}{\mathrm{d}t} = v_\theta(X_t, t)$, suppose the dynamics of the reverse pass satisfy $\frac{\mathrm{d}X_t}{\mathrm{d}t} = -v_\theta(X_t, t)$, where $v_\theta$ is Lipschitz continuous with constant $L$. The perturbation $\Delta_T$ at $t = T$ propagates backward to $t = 0$. T...

Figures (9)

  • Figure 1: Results on 2D synthetic dataset. We evaluate the performance of 2-Rectified Flow using the Euler solver, midpoint solver, and our proposed approach on a 2D synthetic dataset. The source distribution $\pi_0$ (orange) and the target distribution $\pi_1$ (green) are parameterized as Gaussian mixture models. For the Euler method, the number of sampling steps is set to $N = 20$, corresponding to an NFE of 20. Our approach generates samples that align more closely with the target distribution, achieving a better match in density and structure. Additionally, the trajectories of the samples exhibit greater straightness, adhering closely to the ideal of linear motion.
  • Figure 2: Illustrations of the approximation error in velocity ($\|\hat{v}_\theta - v_\theta\|$) as it evolves with inversion steps (left subfigures) and denoising steps (right subfigures), with $\Delta t$ included as a reference.
  • Figure 3: Image reconstruction errors versus denoising NFE: Our approach, compared to the first-order vanilla ReFlow inversion and second-order RF-solver, achieves lower reconstruction errors and demonstrates faster convergence with respect to NFE.
  • Figure 4: Qualitative results of image reconstruction. Our approach achieves faster convergence and superior reconstruction quality compared to baseline ReFlow methods utilizing the FLUX model. Difference images showing the pixel-wise variations between the source image and the reconstructed images are also provided.
  • Figure 5: Comparison with state-of-the-art editing methods.
  • ...and 4 more figures

Theorems & Definitions (6)

  • Proposition 3.1
  • Proposition 4.1
  • Theorem 4.2
  • proof
  • proof
  • proof