Rethinking Learned Image Compression: Context is All You Need
Jixiang Luo
TL;DR
The paper investigates the boundary of learned image compression (LIC) under PSNR-based rate-distortion by dissecting how scaling encoder, decoder, context, and dataset affect performance. It finds that gains mainly come from expanding the context model and, to some extent, the decoder, with overfitting on the Kodak dataset acting as a powerful context signal that yields substantial BD-RATE improvements over VVC. Introducing a ChannelAttenBlock and adaptive quantization, the authors show that context-driven RD improvements dominate, though excessive context capacity can hurt PSNR, revealing a nuanced trade-off along the RD frontier. The work highlights data-driven context modeling as the key lever for LIC gains, while also emphasizing generalization risks when relying on overfitted, dataset-specific context. Practically, this suggests directions for constructing more scalable LIC systems that balance context richness, dataset breadth, and entropy modeling to approach the LIC boundary more robustly.
Abstract
Learned image compression (LIC) has recently made rapid progress relative to traditional codecs, which prompts the question: where is the boundary of LIC? This paper splits that question into two sub-problems: 1) Where is the boundary of rate-distortion performance measured in PSNR? 2) How can the compression gain be improved further to reach that boundary? To answer them, the paper analyzes the effectiveness of scaling parameters in the encoder, decoder, and context model, the three components of LIC, and concludes that scaling LIC amounts to scaling the context model and the decoder. Extensive experiments demonstrate that overfitting can actually serve as an effective context. By optimizing the context, the paper further improves PSNR and achieves state-of-the-art performance, with a BD-RATE gain of 14.39% over VVC.
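For readers unfamiliar with the BD-RATE figure quoted in the abstract, here is a minimal sketch of the usual Bjøntegaard delta-rate calculation, which averages the bitrate difference between two codecs at equal PSNR. It is not the paper's evaluation code. The function name `bd_rate` and the sample rate/PSNR points are hypothetical, and the sketch assumes the common convention of a cubic fit of log-rate against PSNR, where a negative result means the test codec needs fewer bits than the anchor.

```python
import numpy as np


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference (percent) of test vs. anchor at equal PSNR.

    Hypothetical helper: fits a cubic polynomial to log(rate) as a function
    of PSNR for each codec and compares the fits over the shared PSNR range.
    """
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)

    # Cubic fit of log-rate versus PSNR for each rate-distortion curve.
    fit_a = np.polyfit(pa, log_ra, 3)
    fit_t = np.polyfit(pt, log_rt, 3)

    # Only the PSNR interval covered by both curves is compared.
    lo = max(pa.min(), pt.min())
    hi = min(pa.max(), pt.max())

    # Definite integrals of both fitted curves over [lo, hi].
    anti_a = np.polyint(fit_a)
    anti_t = np.polyint(fit_t)
    int_a = np.polyval(anti_a, hi) - np.polyval(anti_a, lo)
    int_t = np.polyval(anti_t, hi) - np.polyval(anti_t, lo)

    # Mean log-rate gap, converted back to a percentage.
    avg_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0


# Made-up illustration data (not results from the paper): the test codec
# reaches the same PSNR as the anchor with 20% fewer bits at every point.
anchor_rate = [0.2, 0.4, 0.8, 1.6]   # bits per pixel
anchor_psnr = [30.0, 33.0, 36.0, 39.0]  # dB
test_rate = [0.8 * r for r in anchor_rate]
test_psnr = list(anchor_psnr)

print(f"BD-rate: {bd_rate(anchor_rate, anchor_psnr, test_rate, test_psnr):.2f}%")
```

With this made-up data the result is approximately -20%, that is, a 20% bitrate saving. Under the same sign convention, the 14.39% gain over VVC reported in the abstract would correspond to a BD-rate of about -14.39% with VVC as the anchor.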
