Efficient Image Restoration through Low-Rank Adaptation and Stable Diffusion XL

Haiyang Zhao

TL;DR

The paper tackles efficient image restoration for real-world degraded images by coupling low-rank adaptation with Stable Diffusion XL within the SUPIR framework. It introduces two LoRA modules trained on domain-specific subsets to fine-tune SDXL, guided by a ControlNet adapter and LLaVA-generated prompts to preserve semantic content and texture in latent space. Across degradations, the approach yields higher PSNR, lower LPIPS, and improved SSIM compared to baselines, while achieving speedups from parameter-efficient fine-tuning and latent-diffusion processing. The findings suggest that targeted LoRA fine-tuning of large generative priors can deliver high-fidelity restorations with practical efficiency, though robustness under extreme degradation and broader datasets remain promising directions for future work.
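To make the "parameter-efficient" claim concrete, here is a minimal sketch of the general LoRA idea: a frozen weight matrix W is adjusted by a trainable low-rank product, W + (alpha / r) * B A. The function names, the `alpha` scaling convention, and the layer sizes are illustrative assumptions; the summary does not state the rank or which SDXL layers the two LoRA modules target.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A.

    w: frozen weight, d_out x d_in
    b: trainable factor, d_out x r
    a: trainable factor, r x d_in
    alpha: scaling hyperparameter (assumed convention, not from the paper)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_count(d_out, d_in, r):
    """Number of trainable parameters in the two low-rank factors."""
    return r * (d_in + d_out)


# Hypothetical layer size, chosen only to illustrate the savings.
full = 1024 * 1024
low_rank = lora_param_count(1024, 1024, r=8)
print(f"trainable fraction: {low_rank / full:.4%}")
```

For this hypothetical 1024 x 1024 layer with rank 8, the factors hold 16,384 trainable values against 1,048,576 in the full matrix, which is where the fine-tuning savings come from.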

Abstract

In this study, we propose an enhanced version of the SUPIR image restoration model, built by integrating two low-rank adaptation (LoRA) modules with the Stable Diffusion XL (SDXL) framework. Our method uses LoRA to fine-tune SDXL, improving both restoration quality and efficiency. We collect 2600 high-quality real-world images, each paired with detailed descriptive text, to train the model. The proposed method is evaluated on standard benchmarks and achieves higher peak signal-to-noise ratio (PSNR), lower learned perceptual image patch similarity (LPIPS), and higher structural similarity index measure (SSIM) scores. These results underscore the effectiveness of combining LoRA with SDXL for image restoration and highlight the potential of our approach for generating high-fidelity restored images.
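Of the three metrics named above, PSNR is simple enough to compute by hand: it is 10 log10(MAX^2 / MSE), where MSE is the mean squared error between the reference and restored images. The sketch below assumes 8-bit intensities flattened into plain sequences; the function name and input layout are illustrative, not the paper's evaluation code. LPIPS and SSIM require a learned network and windowed statistics respectively, so they are not shown.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities in [0, max_value].
    """
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    if not reference:
        raise ValueError("images must not be empty")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel is off by one intensity level.
print(psnr([100, 100, 100, 100], [101, 101, 101, 101]))
```

Higher PSNR means the restored image is closer to the reference in a pixel-wise sense, which is why it is usually reported alongside perceptual metrics such as LPIPS.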

Paper Structure

This paper contains 13 sections, 16 equations, 4 figures, and 3 tables.

Figures (4)

  • Figure 1: Our model improves on SUPIR and produces strong restoration results on low-quality real-world images. It also restores fine details better than the original SUPIR.
  • Figure 2: The pipeline of our image restoration model, showing the overall workflow of the proposed method.
  • Figure 3: Comparison with SUPIR. The super-resolution degradation combines Gaussian blur with $\sigma=2$ and $4\times$ downsampling. Our method restores facial details, such as scars, well, and it recovers the texture of hair and clothing better than SUPIR.
  • Figure 4: Qualitative comparison with different methods. Our method accurately restores the texture and details of objects under challenging degradation, whereas other methods can fall short on details such as houses, windows, and clocks.
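
The degradation described for Figure 3 (Gaussian blur with $\sigma=2$ followed by $4\times$ downsampling) can be sketched as below. This is a rough illustration under stated assumptions, not the authors' pipeline: the kernel radius of three standard deviations, the edge clamping, and the use of plain strided subsampling (rather than, say, bicubic resampling) are all choices made here for simplicity, and the image is a grayscale list of rows.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Normalized 1-D Gaussian kernel; radius defaults to ceil(3 * sigma)."""
    if radius is None:
        radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve each row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [
            sum(k * row[min(max(x + i - radius, 0), width - 1)] for i, k in enumerate(kernel))
            for x in range(width)
        ]
        for row in image
    ]


def transpose(image):
    return [list(col) for col in zip(*image)]


def degrade(image, sigma=2.0, factor=4):
    """Separable Gaussian blur, then keep every `factor`-th pixel in each axis."""
    kernel = gaussian_kernel(sigma)
    blurred = transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))
    return [row[::factor] for row in blurred[::factor]]


# An 8 x 8 horizontal ramp image shrinks to 2 x 2 after 4x downsampling.
ramp = [[float(x) for x in range(8)] for _ in range(8)]
low_res = degrade(ramp)
print(len(low_res), len(low_res[0]))
```

Blurring before subsampling suppresses the high frequencies that would otherwise alias, which is why the two operations are typically applied together when synthesizing low-resolution inputs.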