
Beyond Ground-Truth: Leveraging Image Quality Priors for Real-World Image Restoration

Fengyang Xiao, Peng Hu, Lei Xu, XingE Guo, Guanyi Qin, Yuqi Shen, Chengyu Fang, Rihan Zhang, Chunming He, Sina Farsiu

Abstract

Real-world image restoration aims to recover high-quality (HQ) images from degraded low-quality (LQ) inputs captured under uncontrolled conditions. Existing methods typically depend on ground-truth (GT) supervision, assuming that the GT provides a perfect reference. In practice, however, GT sets contain images of inconsistent perceptual fidelity, causing models to converge to the average quality of the training data rather than the highest perceptual quality attainable. To address this problem, we propose a novel framework, termed IQPIR, that introduces an Image Quality Prior (IQP), extracted from pre-trained No-Reference Image Quality Assessment (NR-IQA) models, to explicitly guide restoration toward perceptually optimal outputs. Our approach integrates the IQP with a learned codebook prior through three key mechanisms: (1) a quality-conditioned Transformer, in which NR-IQA-derived scores serve as conditioning signals that steer the predicted representation toward maximal perceptual quality; this design is a plug-and-play enhancement compatible with existing restoration architectures, requiring no structural modification; (2) a dual-branch codebook that disentangles common and HQ-specific features, yielding a comprehensive representation of both generic structural information and quality-sensitive attributes; and (3) a discrete representation-based quality optimization strategy that mitigates the over-optimization effects commonly observed in continuous latent spaces. Extensive experiments on real-world image restoration demonstrate that our method not only surpasses cutting-edge methods but also serves as a generalizable, quality-guided enhancement strategy for existing models. The code is available.
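To make mechanism (1) concrete, the sketch below shows one plausible way a scalar NR-IQA score could condition a Transformer: the score is embedded and added to every token before encoding. This is an illustrative assumption, not the paper's exact architecture; the class name, layer sizes, and additive-conditioning choice are all hypothetical.

```python
import torch
import torch.nn as nn


class QualityConditionedTransformer(nn.Module):
    """Sketch of quality-score conditioning (assumed design): the
    scalar score is embedded and added to every input token, so the
    encoder can steer its prediction toward a target quality level."""

    def __init__(self, dim=64, num_layers=2, num_heads=4):
        super().__init__()
        # Map the scalar score to a token-sized conditioning vector.
        self.score_embed = nn.Sequential(
            nn.Linear(1, dim), nn.GELU(), nn.Linear(dim, dim)
        )
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, tokens, score):
        # tokens: (B, N, dim); score: (B, 1), e.g. an NR-IQA value in [0, 10].
        cond = self.score_embed(score).unsqueeze(1)  # (B, 1, dim), broadcast over N
        return self.encoder(tokens + cond)


model = QualityConditionedTransformer()
tokens = torch.randn(2, 16, 64)
score = torch.full((2, 1), 9.0)  # request near-maximal perceptual quality
out = model(tokens, score)
print(out.shape)  # (2, 16, 64): conditioning preserves the token shape
```

Because the conditioning enters only through the inputs, a module like this can wrap an existing restoration backbone without structural changes, which is what makes the approach plug-and-play.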

Paper Structure

This paper contains 12 sections, 10 equations, 9 figures, 10 tables.

Figures (9)

  • Figure 1: Score distributions given by different Image Quality Assessment (IQA) models across various ground-truth datasets, where higher scores indicate better image quality. Notably, most ground-truth images receive an average quality score between 5 and 8 and retain residual degradation patterns. This limits the network’s ability to output truly high-quality images, i.e., images scoring closer to 9.
  • Figure 2: Structural comparison of different training paradigms, where both (b) and our (c) utilize IQA to guide network training.
  • Figure 3: Results on blind face restoration. DifFace+ denotes DifFace (yue2022difface) augmented with our quality-prior conditioning, which recovers more details. Our IQPIR attains the highest perceptual quality.
  • Figure 4: Overall framework of IQPIR. (a) In the codebook learning stage, a dual-codebook architecture is proposed. The HQ+ codebook is learned to quantize $Z_h$ only when the quality score of $x_h$ exceeds the threshold $S_{thr}$. (b) In the codebook lookup stage, the quality score $S$ is fed as a condition into the Transformer T, which simultaneously predicts two code sequences. The two codebooks are then used to look up the corresponding code entries. Finally, the NR-IQA model computes the quality loss $\mathcal{L}_{quality}$.
  • Figure 5: Visualizations on low light, underwater and backlit image restoration.
  • ...and 4 more figures
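The threshold rule from Figure 4(a) can be sketched as a gated dual-codebook lookup: every feature is quantized with the common codebook, and features from images whose NR-IQA score exceeds $S_{thr}$ are additionally quantized with the HQ+ codebook. The function names, threshold value, and toy codebook entries below are illustrative assumptions, not the learned priors from the paper.

```python
import math

S_THR = 8.0  # assumed quality threshold separating HQ+ features from common ones


def nearest(vec, codebook):
    """Index of the codebook entry closest to vec (L2 distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(vec, codebook[i]))


def dual_quantize(vec, score, common_cb, hq_cb):
    """Sketch of the dual-branch lookup: the common codebook always
    quantizes the feature; the HQ+ codebook is used only when the
    image's NR-IQA score exceeds S_THR (hypothetical gating logic)."""
    codes = {"common": nearest(vec, common_cb)}
    if score > S_THR:
        codes["hq_plus"] = nearest(vec, hq_cb)
    return codes


# Toy 2-D codebooks standing in for the learned discrete priors.
common_cb = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
hq_cb = [[0.5, 0.5], [1.5, 1.5]]

print(dual_quantize([1.1, 1.0], 9.0, common_cb, hq_cb))  # → {'common': 1, 'hq_plus': 1}
print(dual_quantize([1.1, 1.0], 5.0, common_cb, hq_cb))  # → {'common': 1}
```

Keeping the quality optimization in this discrete index space, rather than in a continuous latent, is what the abstract credits with curbing over-optimization: the output can only move between valid code entries.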