SteerFlow: Steering Rectified Flows for Faithful Inversion-Based Image Editing

Thinh Dao, Zhen Wang, Kien T. Pham, Long Chen

Abstract

Recent advances in flow-based generative models have enabled training-free, text-guided image editing by inverting an image into its latent noise and regenerating it under a new target conditional guidance. However, existing methods struggle to preserve source fidelity: higher-order solvers incur additional model inferences, truncated inversion constrains editability, and feature injection methods lack architectural transferability. To address these limitations, we propose SteerFlow, a model-agnostic editing framework with strong theoretical guarantees on source fidelity. In the forward process, we introduce an Amortized Fixed-Point Solver that implicitly straightens the forward trajectory by enforcing velocity consistency across consecutive timesteps, yielding a high-fidelity inverted latent. In the backward process, we introduce Trajectory Interpolation, which adaptively blends target-editing and source-reconstruction velocities to keep the editing trajectory anchored to the source. To further improve background preservation, we introduce an Adaptive Masking mechanism that spatially constrains the editing signal with concept-guided segmentation and source-target velocity differences. Extensive experiments on FLUX.1-dev and Stable Diffusion 3.5 Medium demonstrate that SteerFlow consistently achieves better editing quality than existing methods. Finally, we show that SteerFlow extends naturally to a complex multi-turn editing paradigm without accumulating drift.

Paper Structure

This paper contains 38 sections, 5 theorems, 45 equations, 19 figures, 7 tables, 4 algorithms.

Key Result

proposition 1

Let $v_\theta(z, t)$ be $L$-Lipschitz continuous in $z$ and have a bounded total time derivative $\left\|\frac{d v_\theta}{d t}\right\| \le M$, which represents the maximum curvature (acceleration) of the trajectory. Consider a discretization $\Delta t = 1/N$ and ignore the higher-order error $O(\Delta t^2)$.
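The setting of Proposition 1 can be illustrated with a minimal sketch, assuming a scalar latent and a hand-written velocity field in place of $v_\theta$. The functions `curved_velocity`, `straight_velocity`, `euler_invert`, and `euler_reconstruct` are hypothetical names chosen for illustration and are not taken from the paper. The sketch inverts a point with $N$ Euler steps, integrates back with the same step size, and measures the round-trip error. It does not reproduce the bound of the proposition; it only shows how the error depends on $\Delta t$ and on trajectory curvature.

```python
import math

def curved_velocity(z, t):
    # Hypothetical toy field: 1-Lipschitz in z, with a nonzero time derivative
    # along the trajectory (a curved path).
    return math.sin(z) + t

def straight_velocity(z, t):
    # Hypothetical constant field: zero acceleration, so the path is a straight line.
    return 0.7

def euler_invert(z0, n_steps, velocity):
    # Forward (inversion) pass from t = 0 to t = 1 with explicit Euler steps.
    dt = 1.0 / n_steps
    z = z0
    for i in range(n_steps):
        z = z + dt * velocity(z, i * dt)
    return z

def euler_reconstruct(z1, n_steps, velocity):
    # Backward pass from t = 1 to t = 0. The velocity is evaluated at the
    # current point, which differs from the point used in the forward step.
    dt = 1.0 / n_steps
    z = z1
    for i in range(n_steps, 0, -1):
        z = z - dt * velocity(z, i * dt)
    return z

def round_trip_error(z0, n_steps, velocity):
    z1 = euler_invert(z0, n_steps, velocity)
    return abs(euler_reconstruct(z1, n_steps, velocity) - z0)

curved_errors = {n: round_trip_error(0.5, n, curved_velocity) for n in (10, 20, 40, 80)}
straight_error = round_trip_error(0.5, 10, straight_velocity)

for n, err in curved_errors.items():
    print(f"N={n:3d}  curved round-trip error = {err:.6f}")
print(f"N= 10  straight round-trip error = {straight_error:.2e}")
```

In this toy setting the curved field leaves a round-trip error that shrinks as $N$ grows, while the constant field reconstructs the input up to floating-point rounding, which matches the intuition that a straighter forward trajectory yields a more faithful inverted latent.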

Figures (19)

  • Figure 1: Comparison of SteerFlow and existing inversion-based methods. (a) Unconstrained generation causes over-editing with loss of source structure. (b) Prior approaches, represented by UniEdit [jiao2026unieditflow], rely either on computationally expensive second-order solvers or on heuristic feature injection/truncated inversion, which often suffer from under-editing (as shown by the unchanged source subject) and limited architectural transferability. (c) In contrast, SteerFlow presents a principled, architecture-agnostic editing framework that achieves high source fidelity and faithful target alignment.
  • Figure 2: SteerFlow Editing Framework. Note that adaptive masking is optional.
  • Figure 3: An illustration of the AFP Solver with increasing iterations.
  • Figure 4: Comparison of masking strategies. No masking causes editing leakage, while using only the SAM3 mask restricts editability. SteerFlow Adaptive Mask dynamically expands the SAM3 mask to accommodate appropriate structural changes.
  • Figure 5: Qualitative comparison of editing baselines and SteerFlow on various editing tasks with the FLUX.1-dev model.
  • ...and 14 more figures

Theorems & Definitions (9)

  • proposition 1: Euler Inversion Error Bound
  • proposition 2: Backward Editing Error Bound
  • proposition 3: SteerFlow Editing Error Bound
  • theorem 1: Banach Fixed-Point Theorem
  • proof
  • proposition 4: Convergence of The Update Map
  • proof
  • proof
  • proof
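The Banach Fixed-Point Theorem listed above guarantees that iterating a contraction converges to a unique fixed point from any starting value. The following is a minimal sketch of that idea, not the paper's AFP update map, which is not given in this section. As an assumption for illustration, it solves an implicit step $x = z + \Delta t \, v(x, t_{\text{next}})$ for a scalar toy velocity $v(x, t) = \sin(x) + t$; the map is a contraction because its Lipschitz constant is $\Delta t \cdot L = 0.1 < 1$. The names `fixed_point` and `implicit_step_map` are hypothetical.

```python
import math

def fixed_point(update, x0, tol=1e-12, max_iter=200):
    # Plain fixed-point iteration x_{k+1} = update(x_k).
    # Returns the final iterate and the number of iterations used.
    x = x0
    for k in range(1, max_iter + 1):
        x_new = update(x)
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    return x, max_iter

def implicit_step_map(z, dt, t_next):
    # T(x) = z + dt * v(x, t_next) with the toy velocity v(x, t) = sin(x) + t.
    # Since |d/dx sin(x)| <= 1, T is Lipschitz with constant dt, so dt < 1
    # makes it a contraction.
    return lambda x: z + dt * (math.sin(x) + t_next)

z, dt, t_next = 0.5, 0.1, 0.1
T = implicit_step_map(z, dt, t_next)

z_next, iters = fixed_point(T, z)            # start from the current point
z_other, _ = fixed_point(T, 3.0)             # start from a distant point
residual = abs(z_next - T(z_next))

print(f"fixed point = {z_next:.10f} after {iters} iterations")
print(f"residual    = {residual:.2e}")
print(f"gap between the two starting points = {abs(z_next - z_other):.2e}")
```

Both starting points reach the same solution, as the theorem predicts for a contraction. With a larger step size or a less regular velocity field the contraction condition can fail, in which case this plain iteration carries no convergence guarantee.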