Wasserstein Distortion: Unifying Fidelity and Realism
Yang Qiu, Aaron B. Wagner, Johannes Ballé, Lucas Theis
TL;DR
Wasserstein distortion addresses the tension between pixel-level fidelity and perceptual realism with a local distortion measure inspired by the human visual system (HVS): it pools feature statistics around each image location and compares the pooled distributions using the Wasserstein distance. The measure is defined through a local feature map $\phi$, a pooling PMF $q_{\sigma}$, and a controllable pooling width $\sigma$. Small $\sigma$ emphasizes fidelity and large $\sigma$ emphasizes realism, and the global distortion $D$ is shown to have metric properties when $q_{\sigma}$ has a nonempty spectrum. The paper demonstrates the measure on texture synthesis and saliency-guided natural images, showing smooth transitions from faithful reproduction to realistic but non-identical realizations. Because locality can be varied, the measure is optimizable, can assess realism on a per-image basis, and has potential applications in texture generation and perceptually aware compression.
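To make the roles of $\phi$, $q_{\sigma}$, and $\sigma$ concrete, the following is a minimal 1-D sketch, not the authors' implementation. It assumes an identity feature map (each sample is its own feature), a truncated Gaussian as the pooling PMF, a single pooling width shared by all locations, circular boundary handling, and the $p$-th power of the $p$-Wasserstein distance averaged over locations. All function names (`pooling_pmf`, `pooled_distribution`, `wasserstein_1d`, `wasserstein_distortion`) are hypothetical.

```python
import math


def pooling_pmf(sigma, radius):
    """Hypothetical pooling PMF q_sigma: a truncated, normalized Gaussian
    over the offsets -radius..radius. sigma == 0 puts all mass on offset 0."""
    if sigma == 0:
        return {0: 1.0}
    weights = {k: math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-radius, radius + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def pooled_distribution(signal, n, pmf):
    """Weighted empirical distribution of the features around location n.
    Assumes an identity feature map and circular boundary handling."""
    size = len(signal)
    return [(signal[(n + k) % size], w) for k, w in pmf.items()]


def wasserstein_1d(dist_a, dist_b, p=2):
    """p-th power of the p-Wasserstein distance between two weighted 1-D
    empirical distributions, computed with the monotone (sorted) coupling,
    which is optimal in one dimension for p >= 1."""
    a, b = sorted(dist_a), sorted(dist_b)
    i = j = 0
    mass_a, mass_b = a[0][1], b[0][1]
    cost = 0.0
    while i < len(a) and j < len(b):
        moved = min(mass_a, mass_b)  # transport as much mass as possible
        cost += moved * abs(a[i][0] - b[j][0]) ** p
        mass_a -= moved
        mass_b -= moved
        if mass_a <= 1e-12:  # atom of a exhausted, move to the next one
            i += 1
            if i < len(a):
                mass_a = a[i][1]
        if mass_b <= 1e-12:  # atom of b exhausted, move to the next one
            j += 1
            if j < len(b):
                mass_b = b[j][1]
    return cost


def wasserstein_distortion(x, y, sigma, radius, p=2):
    """Average over locations of the local Wasserstein cost between the
    pooled distributions of x and y (constant sigma assumed)."""
    pmf = pooling_pmf(sigma, radius)
    total = 0.0
    for n in range(len(x)):
        total += wasserstein_1d(pooled_distribution(x, n, pmf),
                                pooled_distribution(y, n, pmf), p)
    return total / len(x)


# y is a rearrangement of x: same sample statistics, different sample order.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 0.0, 3.0, 2.0, 4.0]

d_fidelity = wasserstein_distortion(x, y, sigma=0, radius=0)   # no pooling
d_realism = wasserstein_distortion(x, y, sigma=1e6, radius=2)  # near-uniform pooling
print(d_fidelity, d_realism)
```

With no pooling, each local distribution is a single point mass, so the sketch reduces to the mean squared error between `x` and `y`. With near-uniform pooling over the whole signal, only the overall distribution of values matters, so a rearranged signal incurs almost no distortion. Intermediate values of `sigma` interpolate between these two regimes.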
Abstract
We introduce a distortion measure for images, Wasserstein distortion, that simultaneously generalizes pixel-level fidelity on the one hand and realism or perceptual quality on the other. We show how Wasserstein distortion reduces to a pure fidelity constraint or a pure realism constraint under different parameter choices and discuss its metric properties. Pairs of images that are close under Wasserstein distortion illustrate its utility. In particular, we generate random textures that have high fidelity to a reference texture in one location of the image and smoothly transition to an independent realization of the texture as one moves away from this point. Wasserstein distortion attempts to generalize and unify prior work on texture generation, image realism and distortion, and models of the early human visual system, in the form of an optimizable metric in the mathematical sense.
