Training-Free Rate-Distortion-Perception Traversal With Diffusion

Yuhan Wang; Suzhi Bi; Ying-Jun Angela Zhang

TL;DR

A training-free framework that uses pre-trained diffusion models to traverse the entire RDP surface. The proposed diffusion decoder is proven optimal for the distortion-perception tradeoff under AWGN observations, and the overall framework is proven to achieve the optimal RDP function in the Gaussian case.

Abstract

The rate-distortion-perception (RDP) tradeoff characterizes the fundamental limits of lossy compression by jointly considering bitrate, reconstruction fidelity, and perceptual quality. While recent neural compression methods have improved perceptual performance, they typically operate at fixed points on the RDP surface, requiring retraining to target different tradeoffs. In this work, we propose a training-free framework that leverages pre-trained diffusion models to traverse the entire RDP surface. Our approach integrates a reverse channel coding (RCC) module with a novel score-scaled probability flow ODE decoder. We theoretically prove that the proposed diffusion decoder is optimal for the distortion-perception tradeoff under AWGN observations and that the overall framework with the RCC module achieves the optimal RDP function in the Gaussian case. Empirical results across multiple datasets demonstrate the framework's flexibility and effectiveness in navigating the ternary RDP tradeoff using pre-trained diffusion models. Our results establish a practical and theoretically grounded approach to adaptive, perception-aware compression.
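To make the distortion-perception tradeoff under AWGN observations concrete, the following is a minimal sketch for a scalar Gaussian source. It is not the paper's diffusion decoder. It assumes squared-error distortion and Wasserstein-2 distance as the perception measure (the metrics named in the Figure 2 caption) and uses the known closed form for that setting, $D(P) = D^* + [(P^* - P)_+]^2$, where $D^*$ is the MMSE and $P^*$ is the $W_2$ distance between the source and the MMSE estimate. The function names and the noise level are hypothetical.

```python
import math

def dp_tradeoff(sigma_x, sigma_n, p):
    """Minimum MSE at W2 perception level p for X ~ N(0, sigma_x^2), Y = X + N(0, sigma_n^2).

    Assumes MSE distortion and (unsquared) Wasserstein-2 perception.
    """
    var_y = sigma_x**2 + sigma_n**2
    sigma_mmse = sigma_x**2 / math.sqrt(var_y)   # std of the MMSE estimate E[X|Y]
    d_mmse = sigma_x**2 * sigma_n**2 / var_y     # MMSE distortion D*
    p_mmse = sigma_x - sigma_mmse                # W2 distance between X and E[X|Y], P*
    return d_mmse + max(p_mmse - p, 0.0) ** 2

def linear_gain(sigma_x, sigma_n, p):
    """Gain a such that X_hat = a * Y attains dp_tradeoff(sigma_x, sigma_n, p)."""
    var_y = sigma_x**2 + sigma_n**2
    p_mmse = sigma_x - sigma_x**2 / math.sqrt(var_y)
    target_std = sigma_x - min(p, p_mmse)        # output std fixed by the W2 budget
    return target_std / math.sqrt(var_y)

def mse_of_gain(sigma_x, sigma_n, a):
    """Exact MSE of the linear estimator X_hat = a * (X + N)."""
    return (1.0 - a) ** 2 * sigma_x**2 + a**2 * sigma_n**2

# Hypothetical setting: unit-variance source, unit-variance channel noise.
for p in (0.0, 0.1, 0.2, 0.3):
    a = linear_gain(1.0, 1.0, p)
    print(f"P={p:.1f}  gain={a:.4f}  D(P)={dp_tradeoff(1.0, 1.0, p):.4f}")
```

At $P=0$ the estimator matches the source variance exactly and pays the largest distortion penalty; once $P \ge P^*$ the constraint is inactive and the estimator reduces to the MMSE solution.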

Paper Structure (33 sections, 7 theorems, 99 equations, 17 figures, 3 tables, 2 algorithms)

Introduction
Background
Diffusion Models and Probability Flow ODE
Reverse Channel Coding
A Rate-Distortion Analysis of Current DiffC
Score-Scaled Probability Flow ODE
Conditional and Marginal Distributions of Score-Scaled PF-ODE
Optimal Distortion-Perception Tradeoff Through AWGN Channel
Traversing RDP Function: Optimality and General Algorithm
Experimental Results
CIFAR-10 Dataset
Kodak and DIV2K Datasets
Conclusions
On RDP and DP Tradeoffs
Rate-Distortion-Perception Tradeoff in Lossy Compression
...and 18 more sections

Key Result

Lemma 1

Consider the multivariate Gaussian source $X\sim \mathcal{N}(\boldsymbol\mu_0,\boldsymbol\Sigma_0)$. Let $\boldsymbol{\mu}_k = \sqrt{\bar{\alpha}_k}\boldsymbol{\mu}_0$ and $\boldsymbol{\Sigma}_k=\bar{\alpha}_k\boldsymbol\Sigma_0+(1-\bar{\alpha}_k)\mathbf{I}$ for $k\in\{1, \dots, t\}$. Starting from ... Meanwhile, when $\rho=0$, the variance is $\bar{\alpha}_t\boldsymbol{\Sigma}_0^2\boldsymbol{\Sigma}\dots$
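The parameters $\boldsymbol{\mu}_k$ and $\boldsymbol{\Sigma}_k$ defined in Lemma 1 can be checked numerically. The sketch below assumes they are the marginal mean and covariance of the standard variance-preserving forward step $x_k = \sqrt{\bar{\alpha}_k}\,x_0 + \sqrt{1-\bar{\alpha}_k}\,\epsilon$ with $\epsilon \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$, which is consistent with the formulas above but is not stated in the excerpt. The source parameters and the value of $\bar{\alpha}_k$ are hypothetical.

```python
import numpy as np

def forward_marginal(mu0, sigma0, alpha_bar_k):
    """Closed-form mean and covariance from Lemma 1 for a Gaussian source."""
    mu0 = np.asarray(mu0, dtype=float)
    sigma0 = np.asarray(sigma0, dtype=float)
    mu_k = np.sqrt(alpha_bar_k) * mu0
    sigma_k = alpha_bar_k * sigma0 + (1.0 - alpha_bar_k) * np.eye(mu0.shape[0])
    return mu_k, sigma_k

# Hypothetical 2-D Gaussian source and noise level (not taken from the paper).
mu0 = np.array([1.0, -2.0])
sigma0 = np.array([[2.0, 0.5],
                   [0.5, 1.0]])
alpha_bar = 0.6

mu_k, sigma_k = forward_marginal(mu0, sigma0, alpha_bar)

# Monte Carlo check under the assumed forward step.
rng = np.random.default_rng(0)
n = 200_000
x0 = rng.multivariate_normal(mu0, sigma0, size=n)
eps = rng.standard_normal((n, 2))
xk = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

emp_mean = xk.mean(axis=0)
emp_cov = np.cov(xk, rowvar=False)

print("closed-form mean:", mu_k, " empirical:", emp_mean)
print("closed-form cov:\n", sigma_k, "\nempirical cov:\n", emp_cov)
```

As $\bar{\alpha}_k \to 0$ the covariance approaches the identity, and as $\bar{\alpha}_k \to 1$ it approaches $\boldsymbol{\Sigma}_0$, so the noise level interpolates between the source and pure noise.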

Figures (17)

Figure 1: The proposed framework to traverse the RDP function using pre-trained diffusion models.
Figure 2: Information-theoretic RDP function for a scalar Gaussian source (dashed line) and the rate, MSE, and $W_2$ distance levels achieved by our scheme (solid dots). (a) The RDP surface. (b) $R(D,P)$ function along DP planes. Different colors represent different rates.
Figure 3: Effect of controlling $t$ and $\rho$ on different metrics for the CIFAR-10 dataset. Distortion is quantified by MSE, and perception is measured by LPIPS and FID.
Figure 4: Rate-distortion-perception curves on the CIFAR-10 dataset. Distortion levels are quantified by MSE and perception levels are measured by LPIPS.
Figure 5: RDP tradeoff traversed by our proposed scheme on the Kodak and DIV2K datasets. We show the results obtained with Stable Diffusion (SD) 2.1 and the Flux model, respectively. More tradeoffs measured in different metrics (e.g., PSNR and FID) can be found in the appendix.
...and 12 more figures

Theorems & Definitions (16)

Remark 1
Lemma 1
proof
Proposition 2 (DP tradeoff in Wasserstein space; Freirich et al., 2021)
proof
Theorem 3
proof
Remark 2
Theorem 4
proof
...and 6 more
