Rate-Distortion-Perception Tradeoff for Gaussian Vector Sources
Jingjing Qian, Sadaf Salehkalaibar, Jun Chen, Ashish Khisti, Wei Yu, Wuxian Shi, Yiqun Ge, Wen Tong
TL;DR
This work extends the rate-distortion-perception (RDP) framework to Gaussian vector sources under quadratic distortion and KL or Wasserstein-2 perception losses. It proves that jointly Gaussian reconstructions are optimal and derives a convex, per-component optimization problem that yields the RDP function. Unlike traditional reverse water-filling, an active perception constraint forces a positive rate on every component, and the water levels differ across components. For both the KL and Wasserstein-2 metrics, the optimal solution admits a generalized water-filling interpretation, with explicit conditions and, in some cases, closed-form characterizations of the per-component parameters. The perceptually perfect reconstruction case (P=0) gives structural insight and asymptotic behavior in the high- and low-distortion regimes, offering guidance for perception-aware multivariate coding in practice.
Abstract
This paper studies the rate-distortion-perception (RDP) tradeoff for a Gaussian vector source coding problem, where the goal is to compress a multi-component source subject to distortion and perception constraints. Specifically, the RDP setting with either the Kullback-Leibler (KL) divergence or the Wasserstein-2 metric as the perception loss function is examined, and it is shown that for Gaussian vector sources, jointly Gaussian reconstructions are optimal. We further demonstrate that the optimal tradeoff can be expressed as an optimization problem, which can be explicitly solved. The optimal solution has an interesting property. Without the perception constraint, the traditional reverse water-filling solution for the rate-distortion (RD) tradeoff of a Gaussian vector source determines the optimal rate allocated to each component through a single constant, called the water level: if the variance of a component is below the water level, that component is assigned zero compression rate. However, when both the distortion and perception constraints are active, we show that the optimal rates allocated to the different components are always positive. Moreover, the water levels that determine the optimal rate allocation for different components are unequal. We further treat the special case of perceptually perfect reconstruction and study its RDP function in the high-distortion and low-distortion regimes to obtain insight into the structure of the optimal solution.
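For reference, the classical reverse water-filling baseline mentioned in the abstract (the case without a perception constraint) can be sketched as follows. This is the textbook procedure for independent Gaussian components under squared-error distortion, not the RDP solution derived in the paper. The function name `reverse_water_filling`, the bisection search, and the example variances are illustrative assumptions.

```python
import math


def reverse_water_filling(variances, total_distortion, tol=1e-12):
    """Classical reverse water-filling for independent Gaussian components.

    Finds the water level theta with sum_i min(theta, var_i) = total_distortion,
    then sets D_i = min(theta, var_i) and R_i = 0.5 * log2(var_i / D_i) bits.
    Hypothetical helper for illustration only.
    """
    if not 0.0 < total_distortion <= sum(variances):
        raise ValueError("total_distortion must lie in (0, sum(variances)]")

    # sum_i min(theta, var_i) is nondecreasing in theta, so bisection applies.
    lo, hi = 0.0, max(variances)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(min(mid, v) for v in variances) < total_distortion:
            lo = mid
        else:
            hi = mid
    level = 0.5 * (lo + hi)

    # Components with variance below the water level get D_i = var_i.
    distortions = [min(level, v) for v in variances]
    # Such components then have var_i / D_i = 1, hence zero rate.
    rates = [max(0.0, 0.5 * math.log2(v / d)) for v, d in zip(variances, distortions)]
    return level, distortions, rates


if __name__ == "__main__":
    # Assumed example: three components, total squared-error budget 1.5.
    level, distortions, rates = reverse_water_filling([4.0, 1.0, 0.25], 1.5)
    print("water level:", level)
    print("per-component distortions:", distortions)
    print("per-component rates (bits):", rates)
```

In this assumed example the water level is approximately 0.625, so the third component (variance 0.25) falls below it and receives zero rate. This is the behavior the abstract contrasts with the RDP setting, where every component receives a positive rate and the water levels differ across components.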
