Efficient Image Restoration through Low-Rank Adaptation and Stable Diffusion XL
Haiyang Zhao
TL;DR
The paper tackles efficient restoration of real-world degraded images by coupling low-rank adaptation (LoRA) with Stable Diffusion XL (SDXL) within the SUPIR framework. It introduces two LoRA modules, trained on domain-specific subsets, to fine-tune SDXL, guided by a ControlNet adapter and LLaVA-generated prompts to preserve semantic content and texture in latent space. Across degradations, the approach yields higher PSNR and SSIM and lower LPIPS than baselines, with speedups from parameter-efficient fine-tuning and latent-diffusion processing. The findings suggest that targeted LoRA fine-tuning of large generative priors can deliver high-fidelity restorations with practical efficiency; robustness under extreme degradation and evaluation on broader datasets remain directions for future work.
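As an illustration of why LoRA fine-tuning is parameter-efficient, the following is a minimal NumPy sketch of a single linear layer with a low-rank update. It is not the paper's implementation: the class name `LoRALinear`, the rank, the scaling factor `alpha`, and the initialization are all assumptions chosen for illustration, following the usual LoRA formulation in which a frozen weight W is augmented by a trainable product B A.

```python
import numpy as np


class LoRALinear:
    """Hypothetical linear layer: frozen weight W plus a low-rank update.

    The effective weight is W + (alpha / rank) * B @ A, where only A and B
    would be trained. Names and defaults here are illustrative assumptions.
    """

    def __init__(self, weight, rank=4, alpha=4.0, seed=0):
        rng = np.random.default_rng(seed)
        out_features, in_features = weight.shape
        self.weight = weight  # frozen pretrained weight, never updated
        # A gets a small random init; B starts at zero so the layer
        # initially reproduces the frozen model exactly.
        self.A = rng.normal(0.0, 0.01, size=(rank, in_features))
        self.B = np.zeros((out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen path plus scaled low-rank path.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def trainable_params(self):
        # Only the two low-rank factors count as trainable.
        return self.A.size + self.B.size


if __name__ == "__main__":
    W = np.random.default_rng(1).normal(size=(64, 64))
    layer = LoRALinear(W, rank=4)
    print("frozen parameters:", W.size)
    print("trainable parameters:", layer.trainable_params())
```

In this toy setting the low-rank factors hold 512 values against 4096 in the frozen weight, which is the kind of reduction that makes fine-tuning a large model such as SDXL cheaper.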
Abstract
In this study, we propose an enhanced version of the image restoration model SUPIR that integrates two low-rank adaptation (LoRA) modules with the Stable Diffusion XL (SDXL) framework. Our method uses LoRA to fine-tune SDXL, improving both restoration quality and efficiency. We collect 2600 high-quality real-world images, each paired with detailed descriptive text, to train the model. Evaluated on standard benchmarks, the proposed method achieves higher peak signal-to-noise ratio (PSNR), lower learned perceptual image patch similarity (LPIPS), and higher structural similarity index measure (SSIM) scores than baseline methods. These results underscore the effectiveness of combining LoRA with SDXL for image restoration and its potential for generating high-fidelity restored images.
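For readers unfamiliar with the metrics, PSNR is the simplest of the three to state. The sketch below uses the standard definition, PSNR = 10 log10(MAX^2 / MSE), and assumes 8-bit images with a peak value of 255. The function name `psnr` and its parameters are illustrative and are not taken from the paper's evaluation code.

```python
import numpy as np


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    Assumes pixel values lie in [0, max_value]; higher is better.
    """
    reference = np.asarray(reference, dtype=np.float64)
    restored = np.asarray(restored, dtype=np.float64)
    mse = np.mean((reference - restored) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = np.zeros((4, 4))
    off_by_one = np.ones((4, 4))  # every pixel differs by 1, so MSE = 1
    print(psnr(clean, off_by_one))
```

SSIM and LPIPS are more involved (LPIPS relies on a pretrained network) and are usually taken from an existing library rather than reimplemented.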
