Uncertainty-guided Perturbation for Image Super-Resolution Diffusion Model
Leheng Zhang, Weiyi You, Kexuan Shi, Shuhang Gu
TL;DR
This paper tackles real-world single-image super-resolution by reframing diffusion-based SR as an LR-content-aware process. It introduces Uncertainty-guided Noise Weighting (UNW) to apply region-specific noise based on an uncertainty estimate derived from an auxiliary SR network, and couples this with a lighter pixel-space diffusion architecture (PixelUnshuffle + upsampling) and SR conditioning to improve both fidelity and perceptual quality while reducing model size and training overhead. The approach achieves state-of-the-art perceptual performance on synthetic and real-world SR benchmarks, with substantial efficiency gains (e.g., ~30% smaller model and ~167% faster training) and robust qualitative improvements (sharper textures and edges). The work demonstrates the practical viability of region-aware diffusion SR for real-world deployment and provides detailed supplementary material on sampling, weighting, and architecture choices.
Abstract
Diffusion-based image super-resolution methods have demonstrated significant advantages over GAN-based approaches, particularly in terms of perceptual quality. Building upon a lengthy Markov chain, diffusion-based methods possess remarkable modeling capacity, enabling them to achieve outstanding performance in real-world scenarios. Unlike previous methods that focus on modifying the noise schedule or sampling process to enhance performance, our approach emphasizes better utilization of LR information. We find that different regions of the LR image can be viewed as corresponding to different timesteps in a diffusion process: flat areas are closer to the target HR distribution, whereas edge and texture regions are farther away. In flat areas, applying only slight noise is more advantageous for reconstruction. We associate this characteristic with uncertainty and propose using an uncertainty estimate to guide region-specific noise level control, a technique we refer to as Uncertainty-guided Noise Weighting. Pixels with lower uncertainty (i.e., flat regions) receive reduced noise to preserve more LR information, thereby improving performance. Furthermore, we modify the network architecture of previous methods to develop our Uncertainty-guided Perturbation Super-Resolution (UPSR) model. Extensive experimental results demonstrate that, despite reduced model size and training overhead, the proposed UPSR method outperforms current state-of-the-art methods across various datasets, both quantitatively and qualitatively.
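To make the idea of region-specific noise concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the uncertainty proxy is the per-pixel absolute difference between an auxiliary SR estimate and the upsampled LR image, and that the noise weight is a clipped linear function of that proxy. The function names and the `floor` and `slope` values are hypothetical and chosen only for illustration.

```python
import numpy as np


def uncertainty_weight(lr_up, sr_est, floor=0.4, slope=5.0):
    """Map a per-pixel uncertainty proxy to a noise weight in [floor, 1].

    Assumption: uncertainty is approximated by |sr_est - lr_up|, where
    sr_est comes from an auxiliary SR network and lr_up is the upsampled
    LR input. `floor` and `slope` are illustrative, not values from the paper.
    """
    uncertainty = np.abs(sr_est - lr_up)
    # Low-uncertainty (flat) pixels get weight `floor`; high-uncertainty
    # (edge/texture) pixels saturate at 1.0, i.e. the full noise level.
    return np.clip(floor + slope * uncertainty, floor, 1.0)


def perturb(lr_up, sr_est, sigma, rng):
    """Add Gaussian noise whose standard deviation is scaled per pixel."""
    weight = uncertainty_weight(lr_up, sr_est)
    noise = rng.standard_normal(lr_up.shape)
    return lr_up + sigma * weight * noise, weight


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lr_up = np.zeros((2, 2))
    # Top row: SR estimate agrees with LR (flat). Bottom row: it disagrees.
    sr_est = np.array([[0.0, 0.0], [0.5, 0.05]])
    noisy, weight = perturb(lr_up, sr_est, sigma=1.0, rng=rng)
    print(weight)
```

In this sketch, pixels where the auxiliary estimate agrees with the LR input receive only the floor fraction of the nominal noise, so more LR content survives the perturbation, while pixels with large disagreement receive the full noise level.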
