Traversing Distortion-Perception Tradeoff using a Single Score-Based Generative Model

Yuhan Wang, Suzhi Bi, Ying-Jun Angela Zhang, Xiaojun Yuan

TL;DR

This paper tackles the distortion-perception (DP) tradeoff in inverse problems by introducing a variance-scaled reverse diffusion framework that uses a single pre-trained score-based model to flexibly traverse the DP curve. The authors characterize the marginal distributions under scaled reverse diffusion, prove optimality for the conditional Gaussian case, and show that the end-point mean aligns with the MMSE solution while the end-point covariance scales with the variance factor. They implement a practical sampling procedure that uses DPS to approximate the conditional scores, which gives inference-time control over the DP tradeoff without retraining. On 2D distributions and FFHQ images, the method covers more of the DP plane than GAN-based and diffusion-based baselines. The results suggest that a single score network can handle varying measurements and noise levels, providing a flexible and scalable approach for general inverse problems such as Gaussian deblurring and strong super-resolution. Overall, the work advances efficient, principled DP navigation in diffusion-based restoration, with potential impact on real-time image enhancement and adaptable denoising frameworks.

Abstract

The distortion-perception (DP) tradeoff reveals a fundamental conflict between distortion metrics (e.g., MSE and PSNR) and perceptual quality. Recent research has increasingly concentrated on evaluating denoising algorithms within the DP framework. However, existing algorithms either prioritize perceptual quality at the expense of distortion, or focus on minimizing MSE for faithful restoration. When the goal shifts or the noisy measurements vary, adapting to a different point on the DP plane requires retraining or even redesigning the model. Inspired by recent advances in solving inverse problems using score-based generative models, we explore the potential of flexibly and optimally traversing DP tradeoffs using a single pre-trained score-based model. Specifically, we introduce a variance-scaled reverse diffusion process and theoretically characterize the marginal distribution. We then prove that the proposed sampling process is an optimal solution to the DP tradeoff for conditional Gaussian distributions. Experimental results on two-dimensional and image datasets illustrate that a single score network can effectively and flexibly traverse the DP tradeoff for general denoising problems.
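
To make the idea of a variance-scaled reverse diffusion process concrete, here is a minimal sketch on a scalar toy problem where the conditional score is known in closed form. Everything specific in it is an assumption rather than the authors' method: the update rule (a standard DDPM-style ancestral step with the injected noise multiplied by $\lambda$), the linear noise schedule, the toy model $X\sim\mathcal{N}(0,1)$, $Y=X+N$, and the function name `variance_scaled_reverse`. The paper approximates the conditional score with DPS and a trained network; the analytic score is used here only so the example is self-contained.

```python
import numpy as np


def variance_scaled_reverse(y, lam, sigma_n=0.5, T=1000, n_samples=20000, seed=0):
    """Draw samples of x_0 for the toy model X ~ N(0, 1), Y = X + N(0, sigma_n^2).

    lam scales the noise injected at every reverse step (assumed update form).
    Returns the samples plus the exact posterior mean and variance for reference.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)  # assumed linear VP noise schedule
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)

    # Exact Gaussian posterior p(x_0 | y) for this toy model.
    post_mean = y / (1.0 + sigma_n**2)
    post_var = sigma_n**2 / (1.0 + sigma_n**2)

    x = rng.standard_normal(n_samples)  # x_T ~ N(0, 1)
    for k in range(T - 1, -1, -1):
        # p(x_k | y) is Gaussian here, so its score is available analytically.
        mean_k = np.sqrt(alpha_bar[k]) * post_mean
        var_k = alpha_bar[k] * post_var + 1.0 - alpha_bar[k]
        score = -(x - mean_k) / var_k
        # Ancestral-style reverse step; only the noise term is scaled by lam.
        z = rng.standard_normal(n_samples)
        x = (x + betas[k] * score) / np.sqrt(alphas[k]) + lam * np.sqrt(betas[k]) * z
    return x, post_mean, post_var


if __name__ == "__main__":
    for lam in (0.0, 0.5, 1.0):
        samples, m, s2 = variance_scaled_reverse(y=1.0, lam=lam)
        print(f"lambda={lam}: sample mean={samples.mean():.3f}, "
              f"sample var={samples.var():.3f} (posterior mean={m:.3f}, var={s2:.3f})")
```

In this toy setting the sample mean stays close to the posterior mean for every $\lambda$, while the spread of the samples grows with $\lambda$: $\lambda=0$ collapses towards a single MMSE-like point and $\lambda=1$ approximately recovers posterior sampling.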

Paper Structure

This paper contains 27 sections, 3 theorems, 58 equations, 14 figures, 4 tables, 1 algorithm.

Key Result

Theorem 1

For VP-diffusion and $0\leq \lambda\leq 1$, consider the joint inference distribution defined by the initial distribution $p_{\lambda}(\mathbf x_T|\mathbf y)=\mathcal{N}(0,\mathbf I)$ and the variance-scaled reverse transitions $p_{\lambda}(\mathbf x_{k}|\mathbf x_{k+1},\mathbf y)$. Then the corresponding marginal has the distribution $p_{\lambda}(\mathbf x_k|\mathbf y) = \mathcal{N}(\boldsymbol{\mu}_k^{\lambda}, \dots)$ ...
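
The TL;DR states that the end-point mean aligns with the MMSE solution and that the end-point covariance scales with the variance factor. Under that reading, the DP curve for a scalar Gaussian toy problem can be worked out in closed form. The sketch below assumes the reconstruction is $\hat X = \mathbb{E}[X|Y] + \lambda s Z$ with $Z\sim\mathcal{N}(0,1)$ independent of everything else, where $s^2$ is the posterior variance. The $\lambda^2 s^2$ scaling, the toy model $X\sim\mathcal{N}(0,1)$, $Y=X+N$, and the function name `gaussian_dp_point` are all illustrative assumptions, since the theorem statement above is cut off.

```python
import numpy as np


def gaussian_dp_point(lam, sigma_n=0.5):
    """Return (MSE, squared Wasserstein-2 distance) for the scalar Gaussian toy model.

    Model: X ~ N(0, 1), Y = X + N(0, sigma_n^2).
    Assumed reconstruction: X_hat = E[X | Y] + lam * s * Z, with Z ~ N(0, 1).
    """
    s2 = sigma_n**2 / (1.0 + sigma_n**2)  # posterior variance Var(X | Y)
    # Distortion: MMSE error plus the variance of the injected noise.
    mse = s2 * (1.0 + lam**2)
    # Marginal variance of X_hat: Var(E[X | Y]) + lam^2 * s2, with Var(E[X | Y]) = 1 - s2.
    var_hat = (1.0 - s2) + lam**2 * s2
    # Squared W2 between zero-mean 1-D Gaussians is the squared difference of the stds.
    w2_sq = (1.0 - np.sqrt(var_hat)) ** 2
    return mse, w2_sq


if __name__ == "__main__":
    for lam in np.linspace(0.0, 1.0, 5):
        mse, w2_sq = gaussian_dp_point(lam)
        print(f"lambda={lam:.2f}: MSE={mse:.4f}, W2^2={w2_sq:.5f}")
```

Sweeping $\lambda$ from 0 to 1 moves along the curve from the minimum-distortion end (MSE equal to $s^2$, nonzero perceptual distance) to the perfect-perception end (zero distance, MSE equal to $2s^2$).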

Figures (14)

  • Figure 1: An example of a Gaussian mixture, the noisy observation, and the conditional distribution given an observation $y=-0.6$.
  • Figure 2: Trajectories $x_T\to\cdots\to x_k\to\cdots\to x_0$ of different reconstructions for $\lambda=0, 0.3, 0.8$ and $1$. The initial $x_T$ is drawn from $\mathcal{N}(0,1)$.
  • Figure 3: Distortion-perception tradeoff traversed by variance-scaled reverse sampling given different $\lambda$'s. (a) tradeoff between Wasserstein-2 distance and MSE; (b) tradeoff between KL-divergence and MSE.
  • Figure 4: Experiments on a two-dimensional dataset. The first row illustrates the original pinwheel distribution (left) and the noisy observation $Y$ (right), where $Y=aX+N$ for $N\sim\mathcal{N}(0,\sigma_n^2\mathbf{I})$. The second and third rows show the reconstructions: (a)-(e) variance-scaled reverse diffusion process with different $\lambda$'s; (f) PSCGAN with $N=16,\sigma_z=1$.
  • Figure 5: DP tradeoff on the pinwheel dataset traversed by our variance-scaled reverse diffusion process and PSCGAN.
  • ...and 9 more figures

Theorems & Definitions (4)

  • Theorem 1
  • Theorem 2
  • Remark 1
  • Lemma 3