Arbitrary-steps Image Super-resolution via Diffusion Inversion
Zongsheng Yue, Kang Liao, Chen Change Loy
TL;DR
This paper proposes InvSR, a diffusion-inversion-based super-resolution framework that leverages a fixed pre-trained diffusion backbone together with a trainable noise predictor to invert a low-resolution image and generate a high-resolution output. A Partial Noise Prediction (PnP) strategy reduces inversion complexity by starting sampling at an intermediate timestep and compressing the noise maps to a small set, enabling arbitrary-step sampling from 1 to 5. Training optimizes a combination of $\mathcal{L}_2$, LPIPS, and GAN losses to align recovered outputs with ground-truth HR images while maintaining perceptual quality, and experiments show InvSR achieves state-of-the-art or competitive performance with substantial efficiency gains, even in single-step setups. The method demonstrates strong performance across synthetic and real-world SR benchmarks and offers practical flexibility for adapting the sampling process to different degradation types, with potential for further speed-ups via model quantization and hardware optimization.
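Written out, the training objective described above is a weighted sum of the three loss terms. The form below is only a sketch: the weights $\lambda_{\text{lpips}}$ and $\lambda_{\text{gan}}$ and the symbols $\hat{x}$ (recovered output) and $x$ (ground-truth HR image) are notation assumed here for illustration, and their values are not given in this summary.

$$
\mathcal{L}_{\text{total}} = \mathcal{L}_2(\hat{x}, x) + \lambda_{\text{lpips}}\, \mathcal{L}_{\text{LPIPS}}(\hat{x}, x) + \lambda_{\text{gan}}\, \mathcal{L}_{\text{GAN}}(\hat{x})
$$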
Abstract
This study presents a new image super-resolution (SR) technique based on diffusion inversion, aiming to harness the rich image priors encapsulated in large pre-trained diffusion models to improve SR performance. We design a Partial noise Prediction strategy to construct an intermediate state of the diffusion model, which serves as the starting sampling point. Central to our approach is a deep noise predictor to estimate the optimal noise maps for the forward diffusion process. Once trained, this noise predictor can be used to initialize the sampling process partially along the diffusion trajectory, generating the desirable high-resolution result. Compared to existing approaches, our method offers a flexible and efficient sampling mechanism that supports an arbitrary number of sampling steps, ranging from one to five. Even with a single sampling step, our method demonstrates superior or comparable performance to recent state-of-the-art approaches. The code and model are publicly available at https://github.com/zsyOAOA/InvSR.
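To make the idea of an intermediate starting state concrete, here is a minimal pure-Python sketch, not the released implementation. It uses the standard forward-diffusion marginal $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon$, with the low-resolution input standing in for $x_0$ and a predicted noise map standing in for $\epsilon$. The linear beta schedule, the starting timestep of 250, the evenly spaced step selection, and every function name below are assumptions made for illustration. The noise predictor is a random placeholder rather than a trained network.

```python
import math
import random


def alpha_bar_schedule(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) under an assumed linear beta schedule."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def placeholder_noise_predictor(lr_latent, seed=0):
    """Hypothetical stand-in for the trained noise predictor: plain Gaussian noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in lr_latent]


def intermediate_state(lr_latent, noise, alpha_bar_t):
    """Forward-diffusion marginal: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps."""
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [signal_scale * x + noise_scale * e for x, e in zip(lr_latent, noise)]


def select_timesteps(start_t, num_steps):
    """Evenly spaced, decreasing timesteps beginning at start_t (an assumed rule)."""
    if not 1 <= num_steps <= start_t:
        raise ValueError("num_steps must be between 1 and start_t")
    stride = start_t / num_steps
    return [round(start_t - k * stride) for k in range(num_steps)]


if __name__ == "__main__":
    alpha_bars = alpha_bar_schedule()
    start_t = 250  # assumed intermediate starting timestep
    lr_latent = [0.2, -0.5, 0.9]  # toy stand-in for an encoded low-resolution image
    noise = placeholder_noise_predictor(lr_latent)
    # Timestep t (1-based) maps to index t - 1 in the schedule list.
    x_start = intermediate_state(lr_latent, noise, alpha_bars[start_t - 1])
    for steps in (1, 3, 5):
        print(steps, select_timesteps(start_t, steps))
```

Because the starting state is built directly at an intermediate timestep, the number of subsequent sampling steps is a free choice at inference time: a single step jumps from the starting timestep straight to the output, while more steps trade speed for a finer trajectory.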
