ProGIC: Progressive and Lightweight Generative Image Compression with Residual Vector Quantization

Hao Cao; Chengbin Liang; Wenqi Guo; Zhijin Qin; Jungong Han

TL;DR

ProGIC is a compact, progressive generative image codec that pairs residual vector quantization (RVQ) with a lightweight backbone of depthwise-separable convolutions and small attention blocks, enabling practical deployment on both GPUs and CPU-only devices.

Abstract

Recent advances in generative image compression (GIC) have delivered remarkable improvements in perceptual quality. However, many GIC methods rely on large-scale and rigid models, which severely constrains their utility for flexible transmission and practical deployment in low-bitrate scenarios. To address these issues, we propose Progressive Generative Image Compression (ProGIC), a compact codec built on residual vector quantization (RVQ). In RVQ, a sequence of vector quantizers encodes the residuals stage by stage, each with its own codebook. The resulting codewords sum to a coarse-to-fine reconstruction and form a progressive bitstream, enabling previews from partial data. We pair this with a lightweight backbone based on depthwise-separable convolutions and small attention blocks, enabling practical deployment on both GPUs and CPU-only devices. Experimental results show that ProGIC attains compression performance comparable to that of previous methods. It achieves bitrate savings of up to 57.57% on DISTS and 58.83% on LPIPS compared to MS-ILLM on the Kodak dataset. Beyond perceptual quality, ProGIC enables progressive transmission for flexibility and encodes and decodes more than 10 times faster than MS-ILLM on GPUs.
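The stage-by-stage residual encoding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D toy codebooks, the input vector `x`, and the helper names (`nearest`, `rvq_encode`, `rvq_decode`) are all assumptions made for illustration, and a real codec would quantize learned latent features with learned codebooks.

```python
def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )

def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage; each stage encodes the remaining residual."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        # Subtract the chosen codeword so the next stage sees what is left.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices

def rvq_decode(indices, codebooks):
    """Sum the selected codewords; a prefix of indices gives a coarser preview."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out

def squared_error(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

# Hypothetical toy codebooks: a 3x3 grid per stage, halving in scale each stage.
# Each grid contains the zero vector, so adding a stage never increases the error.
codebooks = [
    [(a * s, b * s) for a in (-1, 0, 1) for b in (-1, 0, 1)]
    for s in (1.0, 0.5, 0.25)
]

x = (0.8, -0.3)  # hypothetical latent vector
indices = rvq_encode(x, codebooks)

# Progressive decoding: reconstruct from the first k stages only.
for k in range(len(indices) + 1):
    preview = rvq_decode(indices[:k], codebooks)
    print(k, preview, squared_error(x, preview))
```

Decoding from a prefix of `indices` mimics receiving only part of the bitstream: each additional stage refines the preview, which is the property that makes the representation progressive.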

Paper Structure (42 sections, 8 equations, 23 figures, 6 tables)

Introduction
Related Works
Learned Image Compression
Non-Generative Codecs.
Generative Codecs.
Vector Quantization in Latent Space
Methods
Overall Architecture with RVQ
Lightweight Backbone with Attention and Feature Modulation
Learning Strategy for Progressive Decoding
Experiments
Experimental Setup
Implementation Details.
Training Details.
Evaluation Datasets and Metrics.
...and 27 more sections

Figures (23)

Figure 1: BD-rate vs. Decoding Latency on the Kodak dataset measured with DISTS on one NVIDIA A100 GPU. The proposed ProGIC attains competitive BD-rate while substantially reducing decoding latency. Upper-left indicates better.
Figure 2: Conceptual illustration of the motivation behind ProGIC. The original image vector is approximated by a base vector plus a sequence of residual vectors, yielding progressively improved reconstructions.
Figure 3: (a) Overview of the proposed ProGIC. Each down-/up-sampling stage consists of a stack of $M$ depthwise convolution blocks and a feed-forward network (FFN). The blocks in $g_s(\cdot)$ are modified with feature modulation, as described in Section 3.2. (b) Depthwise convolution block. “Depth conv” denotes a depthwise convolution, while the others are pointwise convolutions. (c) FFN architecture, where “Chunk-2” splits the tensor into two equal parts along the channel dimension.
Figure 4: Feature modulation in an FFN: at each progressive decoding stage, stage-specific scale and bias are applied to the features before the residual addition.
Figure 5: Rate-distortion performance on the Kodak, Tecnick, DIV2K, and CLIC2020-Professional datasets, evaluated with LPIPS and DISTS vs. BPP. Curves closer to the lower-left are better, indicating better quality at the same compression ratio. "OOM" denotes out-of-memory under the official evaluation environment.
...and 18 more figures
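
The feature modulation described in the Figure 4 caption can be sketched as a stage-indexed affine transform applied to the FFN output before the residual addition. This is a hedged illustration only: the function name `modulated_ffn`, the per-channel layout of `scales` and `biases`, and the stand-in `toy_ffn` are assumptions, and the paper's actual parameterization may differ.

```python
def modulated_ffn(features, ffn, scales, biases, stage):
    """Apply ffn, then a stage-specific scale and bias, then the residual addition."""
    hidden = ffn(features)
    # Stage-specific modulation: one (scale, bias) pair per channel and stage.
    modulated = [
        scales[stage][c] * h + biases[stage][c] for c, h in enumerate(hidden)
    ]
    # Residual addition with the block input.
    return [f + m for f, m in zip(features, modulated)]

def toy_ffn(values):
    """Stand-in for a learned FFN (assumption): doubles every channel."""
    return [2.0 * v for v in values]

# Hypothetical parameters for two progressive decoding stages, two channels.
scales = [[1.0, 1.0], [0.5, 0.5]]
biases = [[0.0, 0.0], [0.1, -0.1]]

features = [1.0, 2.0]
for stage in range(2):
    print(stage, modulated_ffn(features, toy_ffn, scales, biases, stage))
```

The same block weights are shared across stages here; only the small scale and bias vectors change with the stage index.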
