When No-Reference Image Quality Models Meet MAP Estimation in Diffusion Latents

Weixia Zhang, Dingquan Li, Guangtao Zhai, Xiaokang Yang, Kede Ma

TL;DR

This paper addresses the challenge of using no-reference image quality assessment (NR-IQA) models for perceptual optimization in real-world image enhancement. It introduces diffusion latent MAP estimation, which pairs an NR-IQA model with a differentiable, bijective diffusion transform (EDICT) so that optimization can run in the diffusion latent space rather than in pixel space. Within the MAP objective, a fidelity term keeps the output close to the input image, while the NR-IQA model acts as a prior that favors naturalness. The authors compare eight NR-IQA models by subjecting their enhanced outputs to debiased psychophysical testing, reveal complementary strengths among them, and develop an improved NR-IQA model, MS-LIQE, that better enhances real-world images while preserving fidelity. The diffusion latent MAP framework provides a new analysis-by-synthesis paradigm for NR-IQA model comparison and offers a practical post-enhancement step to refine outputs from other real-world image enhancers, with potential for generative NR-IQA-based perceptual optimization.

Abstract

Contemporary no-reference image quality assessment (NR-IQA) models can effectively quantify perceived image quality, often achieving strong correlations with human perceptual scores on standard IQA benchmarks. Yet, limited efforts have been devoted to treating NR-IQA models as natural image priors for real-world image enhancement, and consequently comparing them from a perceptual optimization standpoint. In this work, we show -- for the first time -- that NR-IQA models can be plugged into the maximum a posteriori (MAP) estimation framework for image enhancement. This is achieved by performing gradient ascent in the diffusion latent space rather than in the raw pixel domain, leveraging a pretrained differentiable and bijective diffusion process. Different NR-IQA models are likely to produce different enhanced outputs, which in turn provides a new computational means of comparing them. Unlike conventional correlation-based measures, our comparison method offers complementary insights into the respective strengths and weaknesses of the competing NR-IQA models in perceptual optimization scenarios. Additionally, we aim to improve the best-performing NR-IQA model in diffusion latent MAP estimation by incorporating the advantages of other top-performing methods. The resulting model delivers noticeably better results in enhancing real-world images afflicted by unknown and complex distortions, all while preserving a high degree of image fidelity.
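
The idea of gradient ascent in a latent space, through a bijective map, on a fidelity-plus-prior objective can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the paper's implementation: an orthogonal linear map stands in for the pretrained bijective diffusion process, a simple quadratic function stands in for an NR-IQA model, and the names `make_toy_bijection`, `toy_quality`, and `latent_map_estimate` are hypothetical.

```python
import numpy as np


def make_toy_bijection(dim, seed=0):
    """Hypothetical stand-in for a bijective diffusion transform.

    Returns an orthogonal matrix Q so that decode(z) = Q @ z and
    encode(x) = Q.T @ x are exact inverses of each other.
    """
    rng = np.random.default_rng(seed)
    q_mat, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q_mat


def toy_quality(x, target):
    """Hypothetical differentiable 'quality' score (NOT a real NR-IQA model).

    The score is higher the closer x is to an assumed 'natural' target.
    """
    return -0.5 * np.sum((x - target) ** 2)


def latent_map_estimate(y, target, lam=1.0, step=0.1, iters=200, seed=0):
    """Gradient ascent in latent space on  -||D(z) - y||^2 + lam * quality(D(z))."""
    q_mat = make_toy_bijection(y.size, seed)
    z = q_mat.T @ y                      # encode the distorted input as the starting latent
    for _ in range(iters):
        x = q_mat @ z                    # decode the latent back to "image" space
        grad_fidelity = -2.0 * (x - y)   # gradient of -||x - y||^2 with respect to x
        grad_quality = -(x - target)     # gradient of toy_quality with respect to x
        grad_x = grad_fidelity + lam * grad_quality
        z = z + step * (q_mat.T @ grad_x)  # chain rule through x = Q @ z
    return q_mat @ z


if __name__ == "__main__":
    y = np.zeros(4)          # toy "distorted" input
    target = np.ones(4)      # toy "high-quality" reference used only by toy_quality
    x_hat = latent_map_estimate(y, target, lam=1.0)
    print("quality before:", toy_quality(y, target))
    print("quality after: ", toy_quality(x_hat, target))
    print("squared distance to input:", np.sum((x_hat - y) ** 2))
```

In this toy setting, raising `lam` pulls the result toward the quality target and lowering it keeps the result closer to the input, which is the fidelity-versus-naturalness trade-off the abstract describes. A real system would replace the linear map with the pretrained diffusion transform and obtain gradients through automatic differentiation.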

Paper Structure (19 sections, 20 equations, 12 figures, 1 table, 1 algorithm)

This paper contains 19 sections, 20 equations, 12 figures, 1 table, 1 algorithm.

Figures (12)

  • Figure 1: (a) Distorted image used as the initial point. (b) Optimized image obtained by directly maximizing a state-of-the-art NR-IQA model, LIQE [zhang2023blind] (see Eq. \ref{eq:eq1}). (c) Optimized image generated via MAP estimation (see Eq. \ref{eq:eq2}), where the likelihood term is implemented by the mean squared error (MSE) and LIQE [zhang2023blind] is employed as the prior. (d) Optimized image produced by diffusion latent MAP estimation (see Eq. \ref{eq:mapdl}). Below each image is its LIQE-predicted quality score, which ranges up to a maximum of five. A larger value indicates higher predicted quality.
  • Figure 2: Optimized images via MAP estimation (see Eq. \ref{eq:eq2}) for various values of $\lambda$.
  • Figure 3: (a) System diagram of diffusion latent MAP estimation. (b) NR-IQA model comparison and improvement enabled by the proposed diffusion latent MAP estimation framework.
  • Figure 4: Representative test photographic images to be enhanced.
  • Figure 5: Global ranking scores of eight NR-IQA models using diffusion latent MAP estimation (higher scores indicate better performance). Models grouped within the same colored box have statistically indistinguishable performance, according to a two-tailed $t$-test.
  • ...and 7 more figures
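
The three optimization problems contrasted in the Figure 1 caption can be written out roughly as below. This is a sketch in assumed notation, not a reproduction of the paper's equations: $y$ denotes the distorted input, $q(\cdot)$ the quality predicted by an NR-IQA model, $\lambda$ the trade-off weight varied in Figure 2, and $D(\cdot)$ the bijective map from a diffusion latent $z$ to an image. The squared-error fidelity term follows the caption's description of the pixel-domain case and is assumed to carry over to the latent case.

```latex
\begin{align*}
% (b) direct maximization of the NR-IQA model in pixel space
\hat{x} &= \arg\max_{x} \; q(x) \\
% (c) MAP estimation in pixel space: MSE fidelity plus NR-IQA prior
\hat{x} &= \arg\max_{x} \; -\lVert x - y \rVert_2^2 + \lambda\, q(x) \\
% (d) diffusion latent MAP estimation: same objective, optimized over the latent z
\hat{z} &= \arg\max_{z} \; -\lVert D(z) - y \rVert_2^2 + \lambda\, q\bigl(D(z)\bigr),
\qquad \hat{x} = D(\hat{z})
\end{align*}
```

Under this reading, a small $\lambda$ keeps $\hat{x}$ close to $y$, while a large $\lambda$ approaches the unconstrained maximization in case (b).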