Rate-Distortion-Perception Tradeoff Based on the Conditional-Distribution Perception Measure
Sadaf Salehkalaibar, Jun Chen, Ashish Khisti, Wei Yu
TL;DR
This paper addresses the rate-distortion-perception (RDP) tradeoff for memoryless sources under a perception measure defined by the conditional distribution of the source given the encoder output, focusing on the no-shared-randomness setting. It derives a single-letter characterization of the RDP function for finite alphabets and extends it to continuous alphabets with squared-error distortion and squared Wasserstein perception, showing the decoder can be implemented as a noise-adding operation on the MMSE estimate. Key contributions include the equality $R_{\text{C}}(D,P)=R(D,P)$ for discrete sources, a closed-form Bernoulli result with an envelope correction, a continuous-alphabet equivalence with a practical representation via $U'=\text{E}[X|U]$, and a Gaussian-vector reverse-waterfilling solution; Gaussian mixtures receive partial treatment. The results provide a tractable framework for designing perceptually realistic compression without shared randomness and connect to existing marginal-perception findings, with implications for Gaussian-source coding and potential extensions to networks and neural compression systems.
Abstract
This paper studies the rate-distortion-perception (RDP) tradeoff for a memoryless source model in the asymptotic limit of large block-lengths. The perception measure is based on a divergence between the distributions of the source and reconstruction sequences conditioned on the encoder output, first proposed by Mentzer et al. We consider the case when there is no shared randomness between the encoder and the decoder and derive a single-letter characterization of the RDP function for the case of discrete memoryless sources. This is in contrast to the marginal-distribution metric case (introduced by Blau and Michaeli), whose RDP characterization remains open when there is no shared randomness. The achievability scheme is based on lossy source coding with a posterior reference map. For the case of continuous-valued sources under the squared-error distortion measure and the squared quadratic Wasserstein perception measure, we also derive a single-letter characterization and show that the decoder can be restricted to a noise-adding mechanism. Interestingly, the RDP function characterized for the case of zero perception loss coincides with that of the marginal metric, and, further, zero perception loss can be achieved with a 3-dB penalty in minimum distortion. Finally, we specialize to the case of Gaussian sources, derive the RDP function for the Gaussian vector case, and propose a reverse water-filling type solution. We also partially characterize the RDP function for a mixture of Gaussian vector sources.
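The noise-adding decoder and the 3-dB penalty mentioned above can be illustrated with a small numerical sketch. It is not the paper's coding scheme: it assumes a scalar Gaussian source and uses a hypothetical single-letter test channel $U = X + Z$ in place of an actual encoder. The function name `posterior_sampling_demo` and its parameters are illustrative choices. The decoder forms the MMSE estimate $\text{E}[X|U]$ and adds independent Gaussian noise with variance $\text{Var}(X|U)$, so that the reconstruction given $U$ has the same distribution as the source given $U$. Under these assumptions the mean squared error doubles relative to the MMSE estimate, which corresponds to $10\log_{10} 2 \approx 3$ dB.

```python
import numpy as np


def posterior_sampling_demo(sigma2=1.0, noise_var=0.5, n=200_000, seed=0):
    """Compare the MMSE decoder with a noise-adding (posterior-sampling) decoder.

    Assumptions (not from the paper): scalar source X ~ N(0, sigma2) and a
    hypothetical test channel U = X + Z, Z ~ N(0, noise_var), standing in
    for the encoder output.
    """
    rng = np.random.default_rng(seed)

    # Source samples and the stand-in encoder output.
    x = rng.normal(0.0, np.sqrt(sigma2), n)
    u = x + rng.normal(0.0, np.sqrt(noise_var), n)

    # MMSE estimate U' = E[X | U] for jointly Gaussian (X, U).
    gain = sigma2 / (sigma2 + noise_var)
    x_mmse = gain * u

    # Posterior variance Var(X | U); the same for every value of U here.
    post_var = sigma2 * noise_var / (sigma2 + noise_var)

    # Noise-adding decoder: MMSE estimate plus independent N(0, post_var) noise,
    # so that the reconstruction given U is distributed like X given U.
    x_hat = x_mmse + rng.normal(0.0, np.sqrt(post_var), n)

    d_mmse = float(np.mean((x - x_mmse) ** 2))   # distortion of MMSE decoder
    d_perc = float(np.mean((x - x_hat) ** 2))    # distortion with added noise
    var_hat = float(np.var(x_hat))               # should be close to sigma2
    return d_mmse, d_perc, var_hat


if __name__ == "__main__":
    d_mmse, d_perc, var_hat = posterior_sampling_demo()
    penalty_db = 10.0 * np.log10(d_perc / d_mmse)
    print(f"MMSE distortion:          {d_mmse:.4f}")
    print(f"Noise-adding distortion:  {d_perc:.4f}")
    print(f"Penalty (dB):             {penalty_db:.2f}")
    print(f"Reconstruction variance:  {var_hat:.4f}")
```

The factor of two arises because the added noise is independent of the estimation error $X - \text{E}[X|U]$ and has the same variance, so the two contributions add. The reconstruction variance returning to approximately `sigma2` is a quick check that the output matches the source marginal in this Gaussian setting.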
