POLARIS: Projection-Orthogonal Least Squares for Robust and Adaptive Inversion in Diffusion Models
Wenshuo Chen, Haosen Li, Shaofeng Liang, Lei Wang, Haozhe Jia, Kaishen Yuan, Jieming Wu, Bowen Tian, Yutao Yue
TL;DR
POLARIS identifies noise-prediction discrepancy during DDIM inversion as the root source of trajectory drift and proposes a per-step, dynamic guidance-scale update to minimize inversion error at its origin. By reformulating inversion as an error-origin optimization and deriving a robust closed-form update, POLARIS achieves substantial gains in reconstruction fidelity and editing robustness with minimal overhead, acting as a seamless plug-in for existing diffusion pipelines. Theoretical analysis confirms stability of the approximate solution, and experiments across reconstruction, editing, and scaling to larger models demonstrate broad applicability and practical impact for high-fidelity diffusion-based editing and restoration.
Abstract
The Inversion-Denoising Paradigm, which is based on diffusion models, excels in diverse image editing and restoration tasks. We revisit its mechanism and reveal a critical, overlooked factor in reconstruction degradation: the approximate noise error. This error stems from approximating the noise at step t with the prediction at step t-1, resulting in severe error accumulation throughout the inversion process. We introduce Projection-Orthogonal Least Squares for Robust and Adaptive Inversion (POLARIS), which reformulates inversion from an error-compensation problem into an error-origin problem. Rather than optimizing embeddings or latent codes to offset accumulated drift, POLARIS treats the guidance scale ω as a step-wise variable and derives a mathematically grounded formula to minimize inversion error at each step. Remarkably, POLARIS improves inversion latent quality with just one line of code. With negligible performance overhead, it substantially mitigates noise approximation errors and consistently improves the accuracy of downstream tasks.
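To make the idea of a step-wise guidance scale concrete, the sketch below treats ω as the unknown in a one-dimensional least-squares problem. The exact POLARIS update and its objective are not stated in this section, so this is a generic stand-in and not the authors' formula. It assumes the guided noise takes the usual classifier-free-guidance form, eps_uncond + ω (eps_cond - eps_uncond), and that some reference noise eps_target is available to match. The names cfg_noise, least_squares_omega, and eps_target are hypothetical and chosen only for illustration.

```python
import numpy as np

def cfg_noise(eps_uncond, eps_cond, omega):
    """Classifier-free-guidance combination of two noise predictions."""
    return eps_uncond + omega * (eps_cond - eps_uncond)

def least_squares_omega(eps_uncond, eps_cond, eps_target, tiny=1e-12):
    """Scalar omega minimizing ||cfg_noise(eps_uncond, eps_cond, omega) - eps_target||^2.

    Illustrative assumption: this is an ordinary projection of
    (eps_target - eps_uncond) onto the guidance direction, not the
    update derived in the paper.
    """
    direction = (eps_cond - eps_uncond).ravel()
    offset = (eps_target - eps_uncond).ravel()
    # tiny guards against division by zero when both predictions coincide.
    return float(direction @ offset / (direction @ direction + tiny))

# Synthetic stand-ins for model outputs at a single inversion step.
rng = np.random.default_rng(0)
eps_uncond = rng.standard_normal((4, 8, 8))
eps_cond = rng.standard_normal((4, 8, 8))

# Case 1: the target lies exactly on the guidance line, so omega is recovered.
eps_target = cfg_noise(eps_uncond, eps_cond, 3.5)
omega = least_squares_omega(eps_uncond, eps_cond, eps_target)
residual = cfg_noise(eps_uncond, eps_cond, omega) - eps_target

# Case 2: a perturbed target; the leftover error is orthogonal to the direction.
noisy_target = eps_target + 0.1 * rng.standard_normal(eps_target.shape)
omega_noisy = least_squares_omega(eps_uncond, eps_cond, noisy_target)
residual_noisy = cfg_noise(eps_uncond, eps_cond, omega_noisy) - noisy_target
orthogonality = float(residual_noisy.ravel() @ (eps_cond - eps_uncond).ravel())
```

In an actual inversion loop, a per-step scale of this kind would be recomputed at every timestep from that step's two noise predictions, in place of a single fixed ω for the whole trajectory. Because the solve is a pair of dot products, the added cost per step is small compared with a forward pass of the denoiser.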
