Few-Shot Domain Adaptation for Learned Image Compression
Tianyu Zhang, Haotian Zhang, Yuqi Li, Li Li, Dong Liu
TL;DR
This paper tackles the poor generalization of pre-trained learned image compression (LIC) models to out-of-domain images. It introduces a universal few-shot domain adaptation framework that injects compact adapters (Conv-Adapters for latent channel reallocation and LoRA-Adapters for the entropy model) into existing LIC architectures and trains them with a two-stage strategy using only a small set of target-domain samples. Across multiple domains and LIC schemes, the approach achieves rate-distortion (RD) performance close to H.266/VVC intra coding while transmitting fewer than 2% of the model parameters and incurring minimal decoding-time overhead. It also matches full-model finetuning with far fewer trainable parameters. These results suggest that lightweight, plug-and-play adaptation is a viable route to deploying learned codecs in diverse real-world domains.
Abstract
Learned image compression (LIC) has achieved state-of-the-art rate-distortion performance and is considered promising for next-generation image compression. However, pre-trained LIC models usually suffer significant performance degradation when applied to images outside their training domain, which indicates poor generalization. To tackle this problem, we propose a few-shot domain adaptation method for LIC that integrates plug-and-play adapters into pre-trained models. Drawing inspiration from the analogy between latent channels and frequency components, we examine domain gaps in LIC and observe that out-of-training-domain images disrupt the pre-trained channel-wise decomposition. Consequently, we introduce a method for channel-wise re-allocation using convolution-based adapters and low-rank adapters, which are lightweight and compatible with mainstream LIC schemes. Extensive experiments across multiple domains and multiple representative LIC schemes demonstrate that our method significantly enhances pre-trained models, achieving performance comparable to H.266/VVC intra coding with merely 25 target-domain samples. Additionally, our method matches the performance of full-model finetuning while transmitting fewer than $2\%$ of the parameters.
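To illustrate why a low-rank adapter is cheap to transmit, the following is a minimal sketch of a generic low-rank weight update, not the authors' implementation. The helper names (`matmul`, `lora_weight`, `adapter_fraction`), the layer sizes, the rank, and the zero initialization of one factor are all assumptions chosen for illustration; the abstract does not specify them.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, down, up, scale=1.0):
    """Return the adapted weight W + scale * (up @ down).

    w:    frozen pre-trained weight, shape (out, in)
    down: trainable low-rank factor, shape (rank, in)
    up:   trainable low-rank factor, shape (out, rank)
    """
    delta = matmul(up, down)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def adapter_fraction(out_dim, in_dim, rank):
    """Size of the two low-rank factors relative to the full weight matrix."""
    return rank * (in_dim + out_dim) / (out_dim * in_dim)


random.seed(0)
out_dim, in_dim, rank = 8, 8, 1  # hypothetical sizes, not taken from the paper

# Frozen pre-trained weight (random stand-in).
w = [[random.gauss(0, 1) for _ in range(in_dim)] for _ in range(out_dim)]
# Trainable factors; "up" starts at zero so the adapted layer initially
# reproduces the pre-trained one exactly (an assumed, common initialization).
down = [[random.gauss(0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
up = [[0.0] * rank for _ in range(out_dim)]

adapted = lora_weight(w, down, up)
fraction = adapter_fraction(320, 320, 2)  # hypothetical 320x320 layer, rank 2
```

Only `down` and `up` would need to be stored or sent for a new domain, since the pre-trained weight `w` stays frozen. For the hypothetical 320x320 layer at rank 2, the two factors hold 1280 values against 102400 in the full matrix, which is how a per-layer overhead on the order of a percent can arise.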
