Rate-Distortion-Perception Theory for the Quadratic Wasserstein Space
Xiqiang Qu, Jun Chen, Lei Yu, Xiangyu Xu
TL;DR
The paper tackles the distortion-rate-perception tradeoff in lossy source coding under squared-error distortion and squared Wasserstein-2 perception, with a cap on common randomness. It derives a unified single-letter characterization $D^*(R,C,P)=D(R,C,P)$, where $D(R,C,P)$ minimizes a sum of a conventional distortion term and a Wasserstein-based perceptual penalty under mutual information constraints, and proves achievability via soft-covering with Wasserstein distance and a linear interpolation decoder. In the Gaussian setting, it provides explicit formulas for scalar and vector sources, showing that a universal representation exists for scalar sources with finite common randomness, while such universality may fail for vector sources when $C>0$. The results unify and extend prior extremes (no randomness, unlimited randomness, perfect perception) and have practical implications for designing perception-aware coding schemes using tools like nested lattice quantization and optimal transport. They also illuminate the role of common randomness in perception-constrained performance and open avenues for extending to other distortion/perception measures and to more general distribution families.
Abstract
We establish a single-letter characterization of the fundamental distortion-rate-perception tradeoff with limited common randomness under the squared error distortion measure and the squared Wasserstein-2 perception measure. Moreover, it is shown that this single-letter characterization can be explicitly evaluated for the Gaussian source. Various notions of universal representation are also clarified.
