Adams Bashforth Moulton Solver for Inversion and Editing in Rectified Flow

Yongjia Ma, Donglin Di, Xuan Liu, Xiaokai Chen, Lei Fan, Tonghua Su, Yue Gao

TL;DR

This work introduces ABM-Solver, which integrates a multi-step predictor-corrector approach to reduce local truncation errors, employs Adaptive Step Size Adjustment to improve sampling speed, and adds a Mask Guided Feature Injection module that preserves non-edited regions while facilitating semantic modifications.

Abstract

Rectified flow models have achieved remarkable performance in image and video generation tasks. However, existing numerical solvers face a trade-off between fast sampling and high-accuracy solutions, limiting their effectiveness in downstream applications such as reconstruction and editing. To address this challenge, we propose leveraging the Adams-Bashforth-Moulton (ABM) predictor-corrector method to enhance the accuracy of ODE solving in rectified flow models. Specifically, we introduce ABM-Solver, which integrates a multi-step predictor-corrector approach to reduce local truncation errors and employs Adaptive Step Size Adjustment to improve sampling speed. Furthermore, to effectively preserve non-edited regions while facilitating semantic modifications, we introduce a Mask Guided Feature Injection module, which estimates self-similarity to generate a spatial mask that differentiates preserved regions from those available for editing. Extensive experiments on multiple high-resolution image datasets validate that ABM-Solver significantly improves inversion precision and editing quality, outperforming existing solvers without requiring additional training or optimization.
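
The predictor-corrector idea named in the abstract can be illustrated with a minimal sketch of a generic second-order Adams-Bashforth-Moulton scheme. This is not the authors' implementation: the function name `abm2_solve`, the fixed step size, and the toy scalar ODE standing in for a rectified flow velocity field are all assumptions made for illustration, and Adaptive Step Size Adjustment is left out.

```python
import math


def abm2_solve(f, t0, y0, t1, n_steps):
    """Integrate dy/dt = f(t, y) from t0 to t1 with a fixed-step,
    second-order Adams-Bashforth-Moulton predictor-corrector.

    Hypothetical helper: in a rectified flow setting, f would be the
    learned velocity field and y the latent; here both are scalars.
    """
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    f_prev = f(t, y)

    # Bootstrap: a multi-step method needs one earlier slope, so the
    # first step uses Heun's method (Euler predictor, trapezoid corrector).
    y_pred = y + h * f_prev
    y = y + 0.5 * h * (f_prev + f(t + h, y_pred))
    t += h

    for _ in range(n_steps - 1):
        f_curr = f(t, y)
        # Predictor: two-step Adams-Bashforth extrapolation of the slope.
        y_pred = y + 0.5 * h * (3.0 * f_curr - f_prev)
        # Corrector: Adams-Moulton (trapezoidal) update using the
        # slope evaluated at the predicted point.
        y = y + 0.5 * h * (f_curr + f(t + h, y_pred))
        f_prev = f_curr
        t += h
    return y


if __name__ == "__main__":
    # Toy problem dy/dt = -y with y(0) = 1; the exact solution is exp(-t).
    approx = abm2_solve(lambda t, y: -y, 0.0, 1.0, 1.0, 20)
    print("ABM2 error:", abs(approx - math.exp(-1.0)))
```

Reusing the stored slope `f_prev` is the point of the multi-step design: each step costs two new evaluations of `f` (one at the current point, one at the predicted point) while reaching second-order accuracy.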

Paper Structure

This paper contains 29 sections, 24 equations, 9 figures, 3 tables, 1 algorithm.

Figures (9)

  • Figure 1: Comparison of Vanilla ODE Solver and Our ABM-Solver. (a) Vanilla ODE Solver: Conventional numerical integration accumulates approximation errors, compromising inversion accuracy and editing fidelity. (b) Our ABM-Solver: By integrating a second-order ABM predictor–corrector framework, the Adaptive Step Size Adjustment, and Mask Guided Feature Injection, our method significantly reduces errors and achieves superior inversion and semantic editing performance.
  • Figure 2: Overview of our Mask Guided Feature Injection module. Given the attention values from the inversion process $\widetilde{V}_{t_i}$ and the sampling process $V_{t_i}$, we first compute a pixel-wise cosine similarity map, which is then thresholded to generate a binary mask $\mathbf{M}$. This mask identifies regions of high structural consistency. It then guides the fusion of attention values for the subsequent timestep $t_{i-1}$, injecting features from the inversion process $\widetilde{V}_{t_{i-1}}$ into the preserved regions. This ensures that the structural integrity of the source is maintained while allowing for targeted semantic edits.
  • Figure 3: Qualitative comparison of image editing results across different methods. Our ABM-Solver demonstrates superior performance, delivering better results in both content preservation and the accuracy of applied edits, particularly in more complex tasks such as object substitution, background editing, and style transfer.
  • Figure 4: Qualitative results of image reconstruction with our ABM-Solver, FireFlow, RF-Solver, and ReFlow-Inv. The results demonstrate that ABM-Solver provides superior structural consistency and detail preservation compared to other methods.
  • Figure 5: Ablation results of the Mask Guided Feature Injection module in our ABM-Solver.
  • ...and 4 more figures
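
The masking steps described in the Figure 2 caption can be sketched roughly as follows. This is an illustrative reading of the caption, not the authors' code: the function names, the `(num_tokens, dim)` array layout, the threshold value of 0.5, and the exact fusion rule (inversion features where the mask is 1, sampling features elsewhere) are assumptions.

```python
import numpy as np


def similarity_mask(v_inv, v_samp, threshold=0.5, eps=1e-8):
    """Binary mask M from the per-token cosine similarity between the
    inversion values and the sampling values at timestep t_i.

    Both inputs have shape (num_tokens, dim). The threshold is a
    hypothetical default, not a value taken from the paper.
    """
    dot = np.sum(v_inv * v_samp, axis=-1)
    norms = np.linalg.norm(v_inv, axis=-1) * np.linalg.norm(v_samp, axis=-1)
    cos = dot / (norms + eps)  # eps guards against zero-norm tokens
    return (cos > threshold).astype(v_inv.dtype)


def inject_features(v_inv_next, v_samp_next, mask):
    """Fuse values for timestep t_{i-1}: keep inversion features where
    the mask marks a preserved region, sampling features elsewhere."""
    m = mask[:, None]  # broadcast the per-token mask over the feature dim
    return m * v_inv_next + (1.0 - m) * v_samp_next


if __name__ == "__main__":
    # Token 0 is identical in both processes; token 1 points the opposite way.
    v_inv = np.array([[1.0, 0.0], [0.0, 1.0]])
    v_samp = np.array([[1.0, 0.0], [0.0, -1.0]])
    mask = similarity_mask(v_inv, v_samp)
    fused = inject_features(np.full((2, 2), 5.0), np.full((2, 2), 8.0), mask)
    print(mask, fused, sep="\n")
```

Under these assumptions, tokens whose values agree between inversion and sampling are treated as preserved and receive the source features, while dissimilar tokens stay free to follow the edit.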

Theorems & Definitions (2)

  • Proof: Sketch of the Local and Global Error Bounds
  • proof