Adapting Learned Image Codecs to Screen Content via Adjustable Transformations
H. Burak Dogaroglu, A. Burakhan Koyuncu, Atanas Boev, Elena Alshina, Eckehard Steinbach
TL;DR
Learned image codecs (LICs) lose coding efficiency on screen content (SC) because it lies outside their training distribution. The authors wrap a fixed baseline codec with parameterized, invertible linear transforms and two neural modules (a prefilter and a postfilter), which are trained end-to-end so that the codec itself is not retrained. Using a desaturation-based forward transform plus CR/RS modules, they report up to $10\%$ BD-Rate savings on SC data with only about $1\%$ extra parameters across multiple baseline LICs; the results emphasize compatibility, modest compute overhead, and broad applicability. The approach offers a practical way to extend LIC performance to specialized domains and invites future work on domain-specific transforms and per-image coefficient optimization.
Abstract
As learned image codecs (LICs) become more prevalent, their low coding efficiency for out-of-distribution data becomes a bottleneck for some applications. To improve the performance of LICs for screen content (SC) images without breaking backwards compatibility, we propose to introduce parameterized and invertible linear transformations into the coding pipeline without changing the underlying baseline codec's operation flow. We design two neural networks to act as prefilters and postfilters in our setup to increase the coding efficiency and help with the recovery from coding artifacts. Our end-to-end trained solution achieves up to 10% bitrate savings on SC compression compared to the baseline LICs while introducing only 1% extra parameters.
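The idea of wrapping an unchanged codec with an invertible linear transform can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the transform is a hypothetical $3 \times 3$ color matrix that blends each channel with the channel mean (controlled by `alpha`), the function names are invented, and the frozen codec is replaced by a uniform quantizer. The neural prefilter and postfilter are omitted.

```python
import numpy as np


def desaturation_matrix(alpha):
    """Hypothetical 3x3 color transform (assumption, not the paper's matrix).

    Blends each channel with the channel mean. alpha = 0 gives the identity;
    alpha = 1 collapses to grayscale and is no longer invertible.
    """
    return (1.0 - alpha) * np.eye(3) + alpha * np.full((3, 3), 1.0 / 3.0)


def apply_color_transform(image, matrix):
    """Apply a 3x3 matrix to the channel axis of an H x W x 3 image."""
    return image @ matrix.T


def stand_in_codec(image, step=1.0 / 255.0):
    """Placeholder for a frozen codec: uniform quantization only."""
    return np.round(image / step) * step


def wrapped_codec(image, alpha):
    """Forward transform, unchanged codec, then the exact inverse transform."""
    forward = desaturation_matrix(alpha)
    inverse = np.linalg.inv(forward)  # exists for any alpha != 1
    transformed = apply_color_transform(image, forward)
    decoded = stand_in_codec(transformed)
    return apply_color_transform(decoded, inverse)
```

Because the transform is inverted exactly at the decoder, the only distortion in this sketch comes from the stand-in codec. The inverse amplifies chroma errors by a factor of $1/(1-\alpha)$, which hints at why a learned postfilter could be useful for recovering from coding artifacts.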
