AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation
Rui Xie, Chen Zhao, Kai Zhang, Zhenyu Zhang, Jun Zhou, Jian Yang, Ying Tai
TL;DR
This work tackles the poor efficiency of Stable Diffusion-based blind super-resolution by introducing AddSR, which combines adversarial diffusion distillation (ADD) with ControlNet. It proposes a prediction-based self-refinement strategy that supplies high-frequency information to the student model's output at marginal additional time cost, and it regulates the teacher model with HR rather than LR images to give a more robust distillation constraint. It also introduces a timestep-adaptive ADD to address the perception-distortion imbalance of the original ADD. Experiments show that AddSR produces better restoration results than previous SD-based state-of-the-art models while running faster (e.g., 7× faster than SeeSR).
Abstract
Blind super-resolution methods based on Stable Diffusion showcase formidable generative capabilities in reconstructing clear high-resolution images with intricate details from low-resolution inputs. However, their practical applicability is often hampered by poor efficiency, stemming from the requirement of hundreds or thousands of sampling steps. Inspired by the efficient adversarial diffusion distillation (ADD), we design AddSR to address this issue by incorporating the ideas of both distillation and ControlNet. Specifically, we first propose a prediction-based self-refinement strategy to provide high-frequency information in the student model output with marginal additional time cost. We further refine the training process by employing HR images, rather than LR images, to regulate the teacher model, providing a more robust constraint for distillation. Second, we introduce a timestep-adaptive ADD to address the perception-distortion imbalance problem introduced by the original ADD. Extensive experiments demonstrate that AddSR generates better restoration results while achieving faster speed than previous SD-based state-of-the-art models (e.g., $7\times$ faster than SeeSR).
