POLARIS: Projection-Orthogonal Least Squares for Robust and Adaptive Inversion in Diffusion Models

Wenshuo Chen, Haosen Li, Shaofeng Liang, Lei Wang, Haozhe Jia, Kaishen Yuan, Jieming Wu, Bowen Tian, Yutao Yue

TL;DR

POLARIS identifies noise-prediction discrepancy during DDIM inversion as the root source of trajectory drift and proposes a per-step, dynamic guidance-scale update to minimize inversion error at its origin. By reformulating inversion as an error-origin optimization and deriving a robust closed-form update, POLARIS achieves substantial gains in reconstruction fidelity and editing robustness with minimal overhead, acting as a seamless plug-in for existing diffusion pipelines. Theoretical analysis confirms stability of the approximate solution, and experiments across reconstruction, editing, and scaling to larger models demonstrate broad applicability and practical impact for high-fidelity diffusion-based editing and restoration.

Abstract

The Inversion-Denoising Paradigm, which is based on diffusion models, excels in diverse image editing and restoration tasks. We revisit its mechanism and reveal a critical, overlooked factor in reconstruction degradation: the approximate noise error. This error stems from approximating the noise at step t with the prediction at step t-1, resulting in severe error accumulation throughout the inversion process. We introduce Projection-Orthogonal Least Squares for Robust and Adaptive Inversion (POLARIS), which reformulates inversion from an error-compensation problem into an error-origin problem. Rather than optimizing embeddings or latent codes to offset accumulated drift, POLARIS treats the guidance scale ω as a step-wise variable and derives a mathematically grounded formula to minimize inversion error at each step. Remarkably, POLARIS improves inversion latent quality with just one line of code. With negligible performance overhead, it substantially mitigates noise approximation errors and consistently improves the accuracy of downstream tasks.
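The approximate noise error described above can be illustrated with a small numerical sketch. This is not the POLARIS update itself: `toy_eps` is a hypothetical stand-in for a trained noise predictor, and the `abar` values are arbitrary cumulative noise-schedule coefficients chosen for illustration. The sketch only shows that a deterministic DDIM step is exactly reversible when the same noise is used in both directions, and that a round-trip error appears once the noise at step t is approximated by the prediction made at step t-1.

```python
import numpy as np

def ddim_step(x, eps, abar_from, abar_to):
    """Deterministic DDIM update from noise level abar_from to abar_to."""
    x0_pred = (x - np.sqrt(1.0 - abar_from) * eps) / np.sqrt(abar_from)
    return np.sqrt(abar_to) * x0_pred + np.sqrt(1.0 - abar_to) * eps

def toy_eps(x, abar):
    """Hypothetical noise predictor (an assumption, not a trained model)."""
    return np.tanh(x) * np.sqrt(1.0 - abar)

def roundtrip_error(x_prev, abar_prev, abar_t):
    """Invert one step with the approximated noise, then denoise back."""
    # Inversion: the noise at step t is unknown, so it is approximated
    # by the prediction made at the previous state x_prev.
    eps_approx = toy_eps(x_prev, abar_prev)
    x_t = ddim_step(x_prev, eps_approx, abar_prev, abar_t)
    # Denoising: the predictor is evaluated at x_t, giving a different noise.
    eps_at_t = toy_eps(x_t, abar_t)
    x_back = ddim_step(x_t, eps_at_t, abar_t, abar_prev)
    recon_err = float(np.linalg.norm(x_back - x_prev))
    noise_gap = float(np.linalg.norm(eps_at_t - eps_approx))
    return recon_err, noise_gap

x_prev = np.array([0.5, -1.0, 2.0])
recon_err, noise_gap = roundtrip_error(x_prev, abar_prev=0.9, abar_t=0.8)
print(f"noise gap: {noise_gap:.4f}, reconstruction error: {recon_err:.4f}")
```

In a multi-step inversion, this per-step gap is incurred at every step, which is the accumulation the abstract refers to.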

Paper Structure

This paper contains 31 sections, 3 theorems, 42 equations, 12 figures, 9 tables, 1 algorithm.

Key Result

Theorem 1

The computation of $\Delta\omega$ is ill-posed: a small perturbation $\delta$ in the noise prediction can cause the error in $\Delta\omega$ to diverge as the norm of the historical guidance direction, $\|\mathbf{b}\|$, approaches zero.
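A short numerical sketch can make this ill-posedness concrete. The exact expression for $\Delta\omega$ is not given in this summary, so the code below assumes a generic scalar least-squares form, $\Delta\omega = \langle \mathbf{a}, \mathbf{b} \rangle / \|\mathbf{b}\|^2$, which minimizes $\|\mathbf{a} - \mathbf{b}\,\Delta\omega\|^2$; the sign convention and the vectors used are illustrative assumptions. Under that form, a perturbation $\delta$ added to $\mathbf{a}$ shifts the solution by $\langle \delta, \mathbf{b} \rangle / \|\mathbf{b}\|^2$, which can be as large as $\|\delta\| / \|\mathbf{b}\|$.

```python
import numpy as np

def ls_delta_omega(a, b):
    """Scalar least-squares solution of min_dw ||a - b * dw||^2 (assumed form)."""
    return float(np.dot(a, b) / np.dot(b, b))

# Illustrative vectors (assumptions, not values from the paper).
a = np.array([0.2, 0.1, -0.3])
direction = np.array([1.0, 0.0, 0.0])
delta = np.array([1e-3, 0.0, 0.0])  # small perturbation aligned with b

errors = []
for scale in (1.0, 1e-2, 1e-4):
    b = scale * direction  # shrink ||b|| while keeping its direction
    err = abs(ls_delta_omega(a + delta, b) - ls_delta_omega(a, b))
    errors.append(err)
    print(f"||b|| = {scale:.0e}  ->  error in delta_omega = {err:.3e}")
```

The perturbation stays fixed while $\|\mathbf{b}\|$ shrinks, so the error in the solution grows in proportion to $1/\|\mathbf{b}\|$.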

Figures (12)

  • Figure 1: (A) Existing CFG-based DDIM methods introduce and accumulate errors at each step of the inversion process, eventually causing a distribution shift in downstream tasks. (B) Our method actively seeks a mathematically invertible path to obtain a latent variable that is closer to the ideal one. During generation, the recorded $\omega_t$ sequence is replayed, enabling high-fidelity reconstruction. (C) A geometric view of our optimization goal, which is to minimize the inversion error.
  • Figure 2: Empirical validation of the negligibility of the history-dependent term. It compares the average magnitude of the current-step error term $\|a\|$ (red line) against the history-dependent term $\|\Delta\omega\| \cdot \|b\|$ (yellow line). The squared ratio of their magnitudes (blue line) is shown to be significantly greater than the reference line across the vast majority of timesteps. It validates our core hypothesis that the history term $\mathbf{b}\Delta\omega$ is numerically negligible, justifying our approximation in Eq. (\ref{eq:approx}).
  • Figure 3: Instability of the "Exact Solution" and resulting reconstruction collapse. The guidance scale $\omega$ computed by this solution (green line) exhibits extreme fluctuations and instability. It leads to a total collapse in the reconstruction (right), which is filled with artifacts and distortion compared to the original image (left). This phenomenon confirms the solution's "practical fragility" and strongly motivates our search for a more robust approximation.
  • Figure 4: Qualitative comparison of POLARIS on complex image editing tasks. Compared to the baseline approach, our POLARIS demonstrates superior editing fidelity and the ability to follow complex instructions.
  • Figure 5: Sensitivity Analysis of the Initial Guidance Scale $\omega_0$, demonstrating the robustness of POLARIS.
  • ...and 7 more figures

Theorems & Definitions (6)

  • Theorem 1
  • Theorem 2
  • proof
  • proof
  • Theorem 3
  • proof