Idempotence and Perceptual Image Compression
Tongda Xu, Ziran Zhu, Dailan He, Yanghao Li, Lina Guo, Yuanyuan Wang, Zhe Wang, Hongwei Qin, Yan Wang, Jingjing Liu, Ya-Qin Zhang
TL;DR
This work uncovers a fundamental link between idempotence and perceptual image compression, showing that ideal conditional generative codecs are idempotent and that an unconditional generative model with an idempotence constraint is equivalent to a conditional generative codec. Building on this, the authors propose a practical inversion-based perceptual codec that combines a pre-trained unconditional generative model with an existing MSE codec and requires no new model training. They provide formal proofs and show empirically that the approach achieves state-of-the-art perceptual quality (lowest FID) across multiple datasets while preserving rate-distortion-perception guarantees and compatibility with the MSE baseline. Although the method incurs higher test-time complexity because of gradient-based inversion, it avoids training multiple conditional models and offers a path to perception-distortion trade-offs using the same bitstream.
Abstract
Idempotence is the stability of an image codec under re-compression. At first glance, it is unrelated to perceptual image compression. However, we find that, theoretically: 1) a conditional generative model-based perceptual codec satisfies idempotence; 2) an unconditional generative model with an idempotence constraint is equivalent to a conditional generative codec. Based on this newfound equivalence, we propose a new paradigm of perceptual image codec by inverting an unconditional generative model with idempotence constraints. Our codec is theoretically equivalent to a conditional generative codec, and it does not require training new models. Instead, it only requires a pre-trained mean-squared-error (MSE) codec and an unconditional generative model. Empirically, we show that our proposed approach outperforms state-of-the-art methods such as HiFiC and ILLM in terms of Fréchet Inception Distance (FID). The source code is provided at https://github.com/tongdaxu/Idempotence-and-Perceptual-Image-Compression.
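
The two notions in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the rounding "codec", the standard normal "prior", and the helper names `toy_codec`, `is_idempotent`, and `constrained_sample` are all assumptions made for illustration, and rejection sampling stands in for the gradient-based inversion of a real generative model.

```python
import numpy as np

STEP = 1.0  # quantization step of the toy codec (illustrative assumption)


def toy_codec(x, step=STEP):
    """Toy stand-in for an MSE codec: encode and decode by rounding to a grid."""
    return np.round(np.asarray(x, dtype=float) / step) * step


def is_idempotent(codec, x):
    """Return True if re-compressing the reconstruction leaves it unchanged."""
    once = codec(x)
    return bool(np.array_equal(codec(once), once))


def constrained_sample(codec, target, rng, max_tries=100_000):
    """Sample from an unconditional prior under an idempotence constraint.

    Candidates are drawn from a standard normal prior (a hypothetical
    stand-in for an unconditional generative model) and accepted only if
    re-compressing them reproduces `target`, the MSE reconstruction.
    """
    for _ in range(max_tries):
        candidate = rng.standard_normal(target.shape)
        if np.array_equal(codec(candidate), target):
            return candidate
    raise RuntimeError("no sample satisfied the idempotence constraint")


rng = np.random.default_rng(0)
x = rng.standard_normal(2)       # "source image"
y = toy_codec(x)                 # MSE reconstruction, known to the decoder
x_hat = constrained_sample(toy_codec, y, rng)  # perceptual reconstruction
```

In this sketch `x_hat` is a draw from the prior restricted to inputs that compress to `y`, which is how a conditional sample given `y` would be distributed in the toy setting; `toy_codec(x_hat)` equals `y`, so the reconstruction survives re-compression.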
