
Characterization of the Distortion-Perception Tradeoff for Finite Channels with Arbitrary Metrics

Dror Freirich, Nir Weinberger, Ron Meir

TL;DR

We address the distortion-perception tradeoff in finite-alphabet channels by formulating the problem with a Wasserstein-1 perception index and a general distortion matrix. The DP function $D(P)$ is shown to be computable via linear programming and, in the discrete setting, is necessarily piecewise linear with a finite set of breakpoints. A dual OT-based characterization yields a structural understanding: the DP curve is the upper envelope of a finite family of linear functions, with breakpoints tied to dual-vertex projections. For binary sources we derive a closed-form expression, revealing explicit breakpoints and linear segments, enabling exact, efficient computation of the DP curve. These results unify and extend prior work on rate-distortion-perception and discrete DP, with practical implications for constructing perceptually constrained reconstructions via simple breakpoint-based strategies.

Abstract

Whenever inspected by humans, reconstructed signals should not be distinguished from real ones. Typically, such a high perceptual quality comes at the price of high reconstruction error, and vice versa. We study this distortion-perception (DP) tradeoff over finite-alphabet channels, for the Wasserstein-$1$ distance induced by a general metric as the perception index, and an arbitrary distortion matrix. Under this setting, we show that computing the DP function and the optimal reconstructions is equivalent to solving a set of linear programming problems. We provide a structural characterization of the DP tradeoff, where the DP function is piecewise linear in the perception index. We further derive a closed-form expression for the case of binary sources.
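The linear-programming claim can be illustrated with a minimal sketch. It assumes a setup that this summary does not spell out: a source $X$ and an observation $Y$ with a known joint pmf, a randomized estimator $p_{\hat{X}|Y}$, and the perception constraint $W_1(p_X, p_{\hat{X}}) \le P$ enforced through an explicit transport plan. The function name `dp_value`, the choice of SciPy's `linprog` as the solver, and the binary example numbers are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog


def dp_value(p_xy, dist, metric, P):
    """Minimal expected distortion subject to W1(p_X, p_Xhat) <= P.

    p_xy   : (n_x, n_y) joint pmf of source X and observation Y (assumed setup)
    dist   : (n_x, n_x) distortion matrix, dist[x, xh]
    metric : (n_x, n_x) ground metric defining the Wasserstein-1 distance
    """
    n_x, n_y = p_xy.shape
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    n_q, n_pi = n_y * n_x, n_x * n_x  # variables: q[y, xh] first, then pi[x, xh]

    # Objective: sum_{x,y,xh} p_xy[x,y] * q[y,xh] * dist[x,xh]; pi carries no cost.
    cost_q = p_xy.T @ dist
    c = np.concatenate([cost_q.ravel(), np.zeros(n_pi)])

    A_eq, b_eq = [], []
    for y in range(n_y):  # each row q[y, :] is a conditional pmf
        row = np.zeros(n_q + n_pi)
        row[y * n_x:(y + 1) * n_x] = 1.0
        A_eq.append(row)
        b_eq.append(1.0)
    for x in range(n_x):  # first marginal of pi equals p_X
        row = np.zeros(n_q + n_pi)
        row[n_q + x * n_x:n_q + (x + 1) * n_x] = 1.0
        A_eq.append(row)
        b_eq.append(p_x[x])
    for xh in range(n_x):  # second marginal of pi equals sum_y p_y[y] * q[y, xh]
        row = np.zeros(n_q + n_pi)
        row[n_q + xh:n_q + n_pi:n_x] = 1.0
        row[xh:n_q:n_x] = -p_y
        A_eq.append(row)
        b_eq.append(0.0)

    # Perception constraint: transport cost of pi under the metric is at most P.
    A_ub = np.concatenate([np.zeros(n_q), metric.ravel()])[None, :]
    res = linprog(c, A_ub=A_ub, b_ub=[P], A_eq=np.array(A_eq), b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun


# Hypothetical example: X ~ Bernoulli(0.2) observed through a binary symmetric
# channel with crossover 0.3, Hamming distortion, and the 0/1 ground metric.
p_xy = np.array([[0.56, 0.24],
                 [0.06, 0.14]])
hamming = 1.0 - np.eye(2)
grid = np.linspace(0.0, 0.4, 9)
curve = [dp_value(p_xy, hamming, hamming, P) for P in grid]
```

Sweeping `P` over a grid, as in the last lines, traces a sampled version of the DP curve. Under the stated assumptions the samples should be non-increasing and lie on linear segments, in line with the piecewise-linear structure described above.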

Paper Structure (15 sections, 10 theorems, 57 equations, 2 figures)

Key Result

Theorem 1

(The perception-distortion tradeoff). If $d_p(p, q)$ is convex in its second argument, then the distortion-perception function $D(P)$, as given by its general definition, is monotonically non-increasing and convex.
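
A short sketch of why this holds, under the assumption (not spelled out in this summary) that the general definition reads $D(P)=\min_{p_{\hat{X}|Y}} \mathbb{E}[\Delta(X,\hat{X})]$ subject to $d_p(p_X,p_{\hat{X}})\le P$, with $\Delta$ denoting the distortion measure and minimizers assumed to exist. Let $q_1, q_2$ be estimators attaining $D(P_1)$ and $D(P_2)$, and let $q_\lambda=\lambda q_1+(1-\lambda)q_2$ for $\lambda\in[0,1]$. Both the expected distortion and the output distribution $p_{\hat{X}}$ are linear in the estimator, so:

$$
\begin{aligned}
\mathbb{E}_{q_\lambda}[\Delta(X,\hat{X})] &= \lambda D(P_1)+(1-\lambda)D(P_2),\\
d_p\big(p_X,\,p_{\hat{X}}^{q_\lambda}\big) &\le \lambda\, d_p\big(p_X,\,p_{\hat{X}}^{q_1}\big)+(1-\lambda)\, d_p\big(p_X,\,p_{\hat{X}}^{q_2}\big) \le \lambda P_1+(1-\lambda)P_2,\\
\Rightarrow\quad D\big(\lambda P_1+(1-\lambda)P_2\big) &\le \lambda D(P_1)+(1-\lambda)D(P_2).
\end{aligned}
$$

Monotonicity follows because the feasible set can only grow as $P$ increases.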

Figures (2)

  • Figure 1: The distortion-perception (DP) function. (Left) The minimal distortion attainable at a given level of perceptual quality forms a convex, non-increasing curve. The region below the curve cannot be attained by any reconstruction method. (Right) In our discrete setting, $D(P)$ is a piecewise linear function. Breakpoints $P^*_i$ and slopes $2u_i$ are given explicitly by Theorem \ref{thm::DP:binarysource} for binary sources.
  • Figure 2: Numerical illustration of Theorem \ref{thm:piecewiseLinear} and Theorem \ref{thm::r2convex} for $|\mathcal{X}|=3$, $|\mathcal{Y}|=5$. The (Middle) pane presents the set $\mathcal{S}_2$ and its convex hull in the $(p_0,p_1)$-plane. The (Right) pane shows the optimal solutions obtained by numerically solving \ref{eq:DPdiscrete_explicit} for different values of $P$. The solutions corresponding to the linear segments of $D(P)$ (Left pane) occur at extreme points of $\mathrm{conv}\left(\mathcal{S}_{2}\right)$.

Theorems & Definitions (16)

  • Theorem 1
  • Proposition 2
  • proof
  • Proposition 3
  • Remark 4
  • Lemma 5
  • Theorem 6
  • proof: Proof of Theorem \ref{thm:piecewiseLinear}
  • Corollary 7
  • Theorem 8
  • ...and 6 more