FreeInv: Free Lunch for Improving DDIM Inversion

Yuxiang Bao; Huijie Liu; Xun Gao; Huan Fu; Guoliang Kang

TL;DR

FreeInv targets the trajectory deviation in DDIM inversion with a transformation-based latent augmentation that approximates an ensemble of trajectories without the time and memory cost of running multiple branches. It formalizes this as a single Monte Carlo sample per time-step: one randomly chosen transformation stands in for the multi-branch ensemble. The method is architecture-agnostic, compatible with both U-Net and DiT diffusion backbones, and improves reconstruction fidelity for images and videos while preserving editing capability. Empirical results on PIE and DAVIS show competitive or superior performance with substantially lower time and memory requirements, which makes FreeInv well suited to video inversion and editing workflows.

Abstract

The naive DDIM inversion process usually suffers from a trajectory deviation issue, i.e., the latent trajectory during reconstruction deviates from the one during inversion. To alleviate this issue, previous methods either learn to mitigate the deviation or design cumbersome compensation strategies to reduce the mismatch error, incurring substantial time and computation cost. In this work, we present a nearly free-lunch method (named FreeInv) to address the issue more effectively and efficiently. In FreeInv, we randomly transform the latent representation and keep the transformation the same between the corresponding inversion and reconstruction time-steps. It is motivated by a statistical observation: an ensemble of DDIM inversion processes over multiple trajectories yields a smaller trajectory mismatch error in expectation. Moreover, through theoretical analysis and empirical study, we show that FreeInv performs an efficient ensemble of multiple trajectories. FreeInv can be freely integrated into existing inversion-based image and video editing techniques. For inverting video sequences in particular, it brings more significant fidelity and efficiency improvements. Comprehensive quantitative and qualitative evaluation on the PIE benchmark and the DAVIS dataset shows that FreeInv remarkably outperforms conventional DDIM inversion and is competitive with previous state-of-the-art inversion methods, with superior computation efficiency.
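The core idea of sharing one random transformation between matching inversion and reconstruction time-steps can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names (`ddim_step`, `sample_rotation`, `invert`, `reconstruct`), the choice of rotations by multiples of 90 degrees as the transformation, the point in the loop where the transformation is applied, and the toy noise predictor `toy_eps` that stands in for a real U-Net or DiT.

```python
import numpy as np

def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM update between two cumulative-alpha levels."""
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def sample_rotation(step_index, seed=0):
    """Hypothetical helper: number of 90-degree rotations for one step.

    Seeding with (seed, step_index) makes inversion and reconstruction
    draw the same transformation at the same time-step.
    """
    rng = np.random.default_rng([seed, step_index])
    return int(rng.integers(0, 4))

def invert(x0, eps_fn, alphas, seed=0, use_freeinv=True):
    """Run DDIM inversion, rotating the latent before each step (assumed placement)."""
    x = x0
    for i in range(len(alphas) - 1):
        k = sample_rotation(i, seed) if use_freeinv else 0
        x = np.rot90(x, k, axes=(-2, -1))
        x = ddim_step(x, eps_fn(x, i), alphas[i], alphas[i + 1])
    return x

def reconstruct(xT, eps_fn, alphas, seed=0, use_freeinv=True):
    """Run DDIM reconstruction, undoing the same rotation after each step."""
    x = xT
    for i in reversed(range(len(alphas) - 1)):
        k = sample_rotation(i, seed) if use_freeinv else 0
        x = ddim_step(x, eps_fn(x, i + 1), alphas[i + 1], alphas[i])
        x = np.rot90(x, -k, axes=(-2, -1))
    return x

def toy_eps(x, i):
    """Toy stand-in for a learned noise predictor (not a real model)."""
    return 0.5 * np.tanh(x)

# Cumulative alphas from nearly clean (index 0) to noisy (last index).
alphas = np.linspace(0.999, 0.02, 11)
x0 = np.random.default_rng(1).standard_normal((4, 8, 8))

xT = invert(x0, toy_eps, alphas)
x0_rec = reconstruct(xT, toy_eps, alphas)
mismatch = float(np.abs(x0_rec - x0).mean())  # reconstruction error of this toy run
```

Calling `invert` and `reconstruct` with `use_freeinv=False` gives the plain DDIM baseline on the same toy setup, so the two `mismatch` values can be compared directly. The toy predictor says nothing about how large the gap is for a real diffusion model.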
