Diffusion Restoration Adapter for Real-World Image Restoration

Hanbang Liang, Zhen Wang, Weihui Deng

TL;DR

This work addresses real-world image restoration by leveraging pretrained diffusion priors while avoiding the heavy conditioning networks typical of ControlNet. It introduces the Diffusion Restoration Adapter, comprising Restoration Adapters integrated into the denoising blocks and Diffusion Adapters based on LoRA to fine-tune select parameters, compatible with both UNet-based SDXL and DiT-based SD3 priors. A Restoration Sampling Strategy guides denoising toward fidelity to the low-quality input, balancing visual fidelity and diversity. Empirical results on RealPhoto60 and DIV2K show competitive quality with far fewer trainable parameters than comparison methods, demonstrating effective, controllable restoration in a parameter-efficient framework.
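The parameter-efficiency claim can be illustrated with simple arithmetic. The sketch below compares the trainable weights of a fully duplicated linear layer (roughly what copying part of a network entails) with those of a low-rank LoRA update on the same layer. The layer width and rank are hypothetical values chosen only for illustration, and the helper names are not from the paper.

```python
def full_copy_params(d_in: int, d_out: int) -> int:
    """Trainable weights when a d_in x d_out linear layer is duplicated and trained."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a LoRA update W + B @ A,
    where A has shape (rank, d_in) and B has shape (d_out, rank)."""
    return rank * (d_in + d_out)


# Hypothetical sizes, chosen only for illustration (not taken from the paper).
d_model = 1024
rank = 16

full = full_copy_params(d_model, d_model)
lora = lora_params(d_model, d_model, rank)

print(f"full copy: {full} weights")
print(f"LoRA (rank {rank}): {lora} weights")
print(f"ratio: {lora / full:.4f}")
```

Because the LoRA count grows linearly in the layer width while the full copy grows quadratically, the gap widens as the prior scales up, which matches the motivation stated here for avoiding ControlNet-style copies.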

Abstract

Diffusion models have demonstrated powerful image generation capabilities, effectively fitting highly complex image distributions. These models can serve as strong priors for image restoration. Existing methods often use techniques such as ControlNet to sample high-quality images from these priors, conditioned on low-quality inputs. However, ControlNet typically involves copying a large part of the original network, so the number of added parameters grows significantly as the prior scales up. In this paper, we propose a relatively lightweight Adapter that leverages the powerful generative capabilities of pretrained priors to achieve photo-realistic image restoration. The Adapter can be applied to both denoising UNets and DiTs, and delivers excellent performance.

Paper Structure

This paper contains 27 sections, 2 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: The proposed Diffusion Restoration Adapter delivers high-quality image restoration. Our method can use descriptive prompts to achieve controllable restoration, as shown in (b). It is competitive with other SOTA methods in qualitative comparisons.
  • Figure 2: The architecture of the Diffusion Restoration Adapter is integrated within the denoising network. Restoration Adapters are inserted into the network blocks of the original structure. Additionally, Diffusion Adapters are applied to specific parameters, particularly integrating LoRA into the self-attention module within each block.
  • Figure 3: The Restoration Adapter has two variants, designed for the denoising UNet and for DiTs respectively. "Linear" denotes a fully connected layer.
  • Figure 4: Qualitative comparisons on RealPhoto60 and the center-cropped DIV2K validation set. Our method produces good results under different kinds of degradation. (Zoom in for details.)
  • Figure 5: Qualitative comparisons on randomly cropped images from DIV2K validation set. From top to bottom, every three images represent the results of the corresponding methods.
  • ...and 1 more figure