Wasserstein Distortion: Unifying Fidelity and Realism

Yang Qiu, Aaron B. Wagner, Johannes Ballé, Lucas Theis

TL;DR

Wasserstein distortion addresses the tension between pixel-level fidelity and perceptual realism with a local distortion measure inspired by the human visual system (HVS): it pools feature statistics around each image location and compares the pooled distributions via the Wasserstein distance. The measure is defined through a local feature map $\phi$ and a pooling PMF $q_{\sigma}$ with a controllable pooling width $\sigma$. Under different choices of $\sigma$ the global distortion $D$ reduces to a pure fidelity constraint or a pure realism constraint, and $D^{1/p}$ is shown to be a metric when the underlying $d$ is a metric and $q_{\sigma}$ has no spectral nulls. The paper demonstrates the measure on texture synthesis and saliency-guided natural images, showing smooth transitions from faithful reproduction to realistic but non-identical realizations and highlighting the framework’s potential for texture generation and perceptually aware compression. Overall, Wasserstein distortion offers a principled, optimizable metric that follows human visual perception by varying locality, enabling per-image realism assessments and practical applications in imaging and compression.

Abstract

We introduce a distortion measure for images, Wasserstein distortion, that simultaneously generalizes pixel-level fidelity on the one hand and realism or perceptual quality on the other. We show how Wasserstein distortion reduces to a pure fidelity constraint or a pure realism constraint under different parameter choices and discuss its metric properties. Pairs of images that are close under Wasserstein distortion illustrate its utility. In particular, we generate random textures that have high fidelity to a reference texture in one location of the image and smoothly transition to an independent realization of the texture as one moves away from this point. Wasserstein distortion attempts to generalize and unify prior work on texture generation, image realism and distortion, and models of the early human visual system, in the form of an optimizable metric in the mathematical sense.

Paper Structure

This paper contains 10 sections, 1 theorem, 15 equations, 9 figures.

Key Result

Theorem 3.1

For any $0 \le \sigma < \infty$, if $d$ is a metric and $q_\sigma(\cdot)$ has no spectral nulls, then $D(\mathbf{z},\mathbf{z}')^{1/p}$ is a metric. If, in addition, $\phi(\cdot)$ is invertible then $D(\mathbf{x},\mathbf{x}')^{1/p}$ is also a metric.
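To make the roles of $q_\sigma$, $\sigma$ and $p$ concrete, the following is a minimal sketch for a one-dimensional scalar feature sequence $\mathbf{z}$. It is not the authors' implementation. This page does not reproduce the definition of $D$, so the sketch assumes the pooled measure at location $n$ places mass $q_\sigma(k)$ on $z_{n+k}$ and that $D$ averages the $p$-th power of the Wasserstein distance over locations. The Gaussian-shaped PMF, the circular boundary handling, and the names `pooling_pmf`, `wasserstein_pp_1d` and `wasserstein_distortion` are illustrative assumptions.

```python
import numpy as np

def pooling_pmf(sigma, radius=None):
    """Hypothetical pooling PMF: Gaussian-shaped weights, truncated and normalized.
    sigma == 0 puts all mass on offset 0 (no pooling)."""
    if sigma == 0:
        return np.array([0]), np.array([1.0])
    if radius is None:
        radius = int(np.ceil(3 * sigma))
    offsets = np.arange(-radius, radius + 1)
    w = np.exp(-0.5 * (offsets / sigma) ** 2)
    return offsets, w / w.sum()

def wasserstein_pp_1d(u, v, w, p=2):
    """W_p^p between sum_k w[k]*delta(u[k]) and sum_k w[k]*delta(v[k]) on the real line,
    computed by integrating |F_u^{-1}(t) - F_v^{-1}(t)|^p over t in [0, 1]."""
    iu, iv = np.argsort(u), np.argsort(v)
    cu, cv = np.cumsum(w[iu]), np.cumsum(w[iv])
    # Both quantile functions are constant between consecutive CDF levels.
    levels = np.clip(np.unique(np.concatenate(([0.0], cu, cv))), 0.0, 1.0)
    mids = 0.5 * (levels[:-1] + levels[1:])
    last = len(w) - 1
    qu = u[iu][np.minimum(np.searchsorted(cu, mids), last)]
    qv = v[iv][np.minimum(np.searchsorted(cv, mids), last)]
    return float(np.sum(np.diff(levels) * np.abs(qu - qv) ** p))

def wasserstein_distortion(z, z_prime, sigma, p=2):
    """Average over locations of W_p^p between the pooled feature distributions
    (assumed form of D; circular indexing is a simplification)."""
    n = len(z)
    offsets, q = pooling_pmf(sigma)
    total = 0.0
    for i in range(n):
        idx = (i + offsets) % n
        total += wasserstein_pp_1d(z[idx], z_prime[idx], q, p)
    return total / n

# An alternating "texture" and a copy shifted by one sample.
z = np.tile([0.0, 1.0], 32)
z_shift = np.roll(z, 1)

d_fidelity = wasserstein_distortion(z, z_shift, sigma=0.0)  # pointwise comparison
d_realism = wasserstein_distortion(z, z_shift, sigma=8.0)   # compares local statistics
print(d_fidelity, d_realism)
```

With $\sigma = 0$ each pooled distribution is a single point mass, so the sketch reduces to a mean $p$-th power error between the sequences. With a wide pooling window the two sequences have nearly identical local statistics, and the distortion becomes small even though no sample matches pointwise.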

Figures (9)

  • Figure 1: Receptive fields in the ventral stream grow with eccentricity.
  • Figure 2: A pictorial illustration of (\ref{eq:ydef}). In the right plot, the size of the disk indicates the probability mass and the vertical coordinate of the center of the disk indicates the value.
  • Figure 3: Examples showing that Wasserstein distortion does not satisfy positivity under a uniform PMF, where the red square in each image indicates the size of the pooling regions. The distortion between the two images on the left (A) is zero even if one uses the full Wasserstein distance in (\ref{eq:ydef}). If one uses MMD \cite{smola2006maximum} as a proxy, then Wasserstein distortion with a uniform PMF is blind to certain blocking artifacts in that the two images on the right (B) have distortion zero. Compare Theorem 3.1. In both examples, $\phi(\cdot)$ is taken to be the coordinate map.
  • Figure 4: VGG-19 network structure.
  • Figure 5: The reference is on the left and the reproduction is on the right for each pair of images. The results are commensurate with dedicated texture generators.
  • ...and 4 more figures
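
The contrast between Figure 3 and the "no spectral nulls" condition of Theorem 3.1 can be checked numerically. The sketch below evaluates the discrete-time Fourier transform of two pooling PMFs on a frequency grid. The 8-tap uniform window and the truncated two-sided geometric PMF with ratio 0.7 are illustrative choices, not parameters taken from the paper, and `dtft` is a hypothetical helper.

```python
import numpy as np

def dtft(pmf, offsets, omegas):
    """Discrete-time Fourier transform of a finitely supported PMF,
    evaluated at each angular frequency in omegas."""
    return np.exp(-1j * np.outer(omegas, offsets)) @ pmf

# Frequency grid on [-pi, pi]; it contains the multiples of pi/4.
omegas = np.linspace(-np.pi, np.pi, 2049)

# Uniform PMF over 8 consecutive offsets (illustrative width).
offsets_u = np.arange(-4, 4)
q_uniform = np.full(8, 1.0 / 8.0)

# Two-sided geometric PMF, truncated far into its tail (illustrative ratio 0.7).
offsets_g = np.arange(-40, 41)
q_geometric = 0.7 ** np.abs(offsets_g)
q_geometric = q_geometric / q_geometric.sum()

min_uniform = np.abs(dtft(q_uniform, offsets_u, omegas)).min()
min_geometric = np.abs(dtft(q_geometric, offsets_g, omegas)).min()
print(min_uniform, min_geometric)
```

The uniform window's spectrum vanishes at nonzero multiples of $2\pi/8$, so it has spectral nulls and the theorem's hypothesis fails, in line with the counterexamples in Figure 3. The geometric PMF's spectrum stays bounded away from zero on the whole grid.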

Theorems & Definitions (2)

  • Theorem 3.1
  • proof