Universal Representations for Classification-enhanced Lossy Compression
Nam Nguyen
TL;DR
The paper tackles efficient lossy compression under multiple objectives by proposing universal representations: a single encoder that supports diverse decoding goals across rate–distortion–perception and rate–distortion–classification constraints. It formalizes these tradeoffs with RDPC functions and introduces a universal RDC framework where decoders are trained for different targets while a fixed encoder provides the shared representation, using dithered quantization and GAN/classifier regularization. Empirical results on MNIST show that universal encoders can closely match end-to-end performance for perception-related objectives, but face distortion penalties in the RDC setting when reused across tradeoffs; scaling the decoder tuning parameters helps mitigate this. Overall, the approach offers a practical path to reducing training cost and model redundancy, with potential extensions to higher-resolution image and video compression.
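The dithered quantization step mentioned above can be illustrated with a minimal sketch. It assumes subtractive uniform dither generated from a seed shared by the encoder and decoder; the function names `encode` and `decode`, the `seed` and `step` parameters, and the latent shape are hypothetical and not taken from the paper.

```python
import numpy as np

def encode(z, seed, step=1.0):
    """Add shared uniform dither, then round to integer indices (hypothetical helper)."""
    u = np.random.default_rng(seed).uniform(-step / 2, step / 2, size=z.shape)
    return np.round((z + u) / step).astype(np.int64)

def decode(indices, seed, step=1.0):
    """Regenerate the same dither from the shared seed and subtract it."""
    u = np.random.default_rng(seed).uniform(-step / 2, step / 2, size=indices.shape)
    return indices * step - u

# Stand-in for an encoder's latent output; the shape is an arbitrary assumption.
z = np.random.default_rng(0).normal(size=(4, 8))
idx = encode(z, seed=123)
z_hat = decode(idx, seed=123)
err = z_hat - z  # quantization error, bounded by half a step in magnitude
```

Because the dither is subtracted at the decoder, the reconstruction error never exceeds half a quantization step. In a universal setup like the one described, only the integer indices would be fixed by the shared encoder, while each decoder consumes the same dequantized latents.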
Abstract
In lossy compression, the classical tradeoff between compression rate and reconstruction distortion has traditionally guided algorithm design. However, Blau and Michaeli [5] introduced a generalized framework, known as the rate-distortion-perception (RDP) function, which incorporates perceptual quality as an additional dimension of evaluation. More recently, the rate-distortion-classification (RDC) function was investigated in [19]; it evaluates compression performance by considering classification accuracy alongside distortion. In this paper, we explore universal representations, where a single encoder is developed to serve multiple decoding objectives across various distortion and classification (or perception) constraints. This universality avoids retraining the encoder for each operating point within these tradeoffs. Our experimental validation on the MNIST dataset indicates that, for perceptual image compression tasks, a universal encoder incurs only minimal performance degradation compared to individually optimized encoders, consistent with prior results in [23]. However, we also find that in the RDC setting, reusing an encoder optimized for one specific classification-distortion tradeoff leads to a significant distortion penalty when it is applied to other points on the tradeoff.
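For reference, the two tradeoff functions named above can be written as constrained rate minimizations. The RDP form follows the standard definition of Blau and Michaeli [5]. The RDC form is a sketch under an assumption: it takes the classification constraint to be a bound on the conditional entropy of a label S given the reconstruction, which may differ from the exact formulation in [19]. The symbols X (source), X̂ (reconstruction), Δ (distortion measure), d (divergence between distributions), and S (class label) are notation chosen here.

```latex
% Rate-distortion-perception function
R(D, P) = \min_{p_{\hat{X} \mid X}} I(X; \hat{X})
\quad \text{s.t.} \quad
\mathbb{E}\big[\Delta(X, \hat{X})\big] \le D, \qquad
d\big(p_X, p_{\hat{X}}\big) \le P

% Rate-distortion-classification function (assumed form of the constraint)
R(D, C) = \min_{p_{\hat{X} \mid X}} I(X; \hat{X})
\quad \text{s.t.} \quad
\mathbb{E}\big[\Delta(X, \hat{X})\big] \le D, \qquad
H\big(S \mid \hat{X}\big) \le C
```

In both cases, a universal representation corresponds to fixing the encoder and varying only the decoder to reach different (D, P) or (D, C) operating points.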
