DeeDSR: Towards Real-World Image Super-Resolution via Degradation-Aware Stable Diffusion

Chunyang Bi, Xin Luo, Sheng Shen, Mengxi Zhang, Huanjing Yue, Jingyu Yang

TL;DR

DeeDSR addresses real-world image super-resolution (SR) by integrating degradation-aware priors into a diffusion framework. It first learns degradation semantics through contrastive learning and then guides a frozen Stable Diffusion model with a Degradation-Aware Adapter that fuses global, image-driven degradation prompts with low-resolution (LR) content via cross-attention and modulation layers, using Noise Guidance to balance realism and fidelity. The approach achieves state-of-the-art performance on synthetic and real-world benchmarks, with strong semantic preservation and competitive perceptual quality, validated by quantitative metrics and user studies. This work demonstrates that image-driven degradation representations can substantially enhance the semantic fidelity of diffusion-based SR, offering a practical path to robust real-world SR applications.

Abstract

Diffusion models, known for their powerful generative capabilities, play a crucial role in addressing real-world super-resolution challenges. However, these models often focus on improving local textures while neglecting the impact of global degradation, which can significantly reduce semantic fidelity and lead to inaccurate reconstructions and suboptimal super-resolution performance. To address this issue, we introduce a novel two-stage, degradation-aware framework that enhances the diffusion model's ability to recognize content and degradation in low-resolution images. In the first stage, we employ unsupervised contrastive learning to obtain representations of image degradations. In the second stage, we integrate a degradation-aware module into a simplified ControlNet, enabling flexible adaptation to various degradations based on the learned representations. Furthermore, we decompose the degradation-aware features into global-semantics and local-details branches, which are then injected into the diffusion denoising module to modulate the target generation. Our method effectively recovers semantically precise and photorealistic details, particularly under significant degradation conditions, demonstrating state-of-the-art performance across various benchmarks. Code will be released at https://github.com/bichunyang419/DeeDSR.
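The abstract does not spell out the contrastive objective used in the first stage. As a rough illustration only, the sketch below computes an InfoNCE-style loss for a single query embedding, assuming the common setup in which two patches that share the same degradation form a positive pair and patches with other degradations serve as negatives. The function names, the temperature value, and the toy embeddings are hypothetical and are not taken from the paper or its code.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes v is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one query embedding.

    query, positive: embeddings assumed to share the same degradation.
    negatives: embeddings assumed to come from other degradations.
    temperature: hypothetical value, not taken from the paper.
    """
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_sum_exp - logits[0]


# Toy embeddings: the loss is small when the positive is close to the query.
q = [1.0, 0.0]
loss_aligned = info_nce(q, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
loss_misaligned = info_nce(q, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(loss_aligned < loss_misaligned)
```

Minimizing a loss of this kind pulls embeddings of identically degraded patches together and pushes differently degraded ones apart, which is one plausible way to obtain a degradation representation without labels.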

Paper Structure (19 sections, 9 equations, 11 figures, 7 tables)

Figures (11)

  • Figure 1: Qualitative comparison of our DeeDSR and StableSR (Wang et al., 2023) under various degradation conditions on the synthetic DIV2K dataset. Our model demonstrates robustness against different degradations, generating correct semantics and textures, while StableSR produces incorrect semantics and textures under severe degradation. Degradation levels include light, medium, and heavy, as detailed in the ablation section and the degradation figure.
  • Figure 2: Pipeline of our proposed DeeDSR. In the first stage, we capture global degradation semantics using a contrastive learning strategy. In the second stage, the degradation information is integrated with LR images via degradation-aware (DA) blocks to precisely control the pre-trained Stable Diffusion (SD) model through Cross Attention Modules and Modulation Layers. Note that the Degradation Learner is fixed in the second stage.
  • Figure 3: Qualitative comparisons of different Real-ISR methods on the synthetic dataset. Please zoom in for a better view.
  • Figure 4: Qualitative comparisons of different methods on real-world datasets. Please zoom in for a better view.
  • Figure 5: Results of user study on synthetic and real-world data.
  • ...and 6 more figures
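
The Figure 2 caption mentions Modulation Layers but does not give their exact form. As a hedged illustration, the sketch below assumes a feature-wise scale-and-shift modulation in which a degradation embedding is projected to per-channel scale and shift values that are applied to LR feature maps. The functions `linear` and `modulate`, the weights, and the tensor shapes are hypothetical and only show the general idea.

```python
def linear(vec, weight, bias):
    """Plain affine projection: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def modulate(features, scale, shift):
    """Per-channel affine modulation (an assumed form, not the paper's).

    features: list of channels, each a flat list of activations.
    scale, shift: one value per channel, derived from the degradation embedding.
    """
    return [[x * (1.0 + g) + b for x in channel]
            for channel, g, b in zip(features, scale, shift)]


# Toy example: a 2-dimensional degradation embedding controls 2 channels.
degradation_embedding = [1.0, 0.0]
scale = linear(degradation_embedding, [[0.0, 0.0], [1.0, 0.0]], [0.0, 0.0])
shift = linear(degradation_embedding, [[0.5, 0.0], [-1.0, 0.0]], [0.0, 0.0])
lr_features = [[1.0, 2.0], [3.0, 4.0]]
print(modulate(lr_features, scale, shift))
```

Writing the scale as `1 + g` keeps the layer close to an identity mapping when the projected values are near zero, a common choice when conditioning a pre-trained network.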