StableCodec: Taming One-Step Diffusion for Extreme Image Compression

Tianyu Zhang; Xin Luo; Li Li; Dong Liu

TL;DR

StableCodec tackles extreme image compression by fusing one-step diffusion with a Deep Compression Latent Codec and a Dual-Branch Coding Structure, enabling high realism and fidelity at ultra-low bitrates. It introduces end-to-end optimization with two-stage implicit bitrate pruning, leveraging perceptual and semantic losses to guide reconstruction under stringent bitrate constraints. Empirically, it achieves state-of-the-art FID, KID, and DISTS on CLIC 2020, DIV2K, and Kodak at bitrates as low as 0.005 bpp, while maintaining competitive inference speed and memory footprint. This approach broadens the practicality of diffusion-based codecs for real-time applications and extreme compression scenarios.

Abstract

Diffusion-based image compression has shown remarkable potential for achieving ultra-low bitrate coding (less than 0.05 bits per pixel) with high realism, by leveraging the generative priors of large pre-trained text-to-image diffusion models. However, current approaches require a large number of denoising steps at the decoder to generate realistic results under extreme bitrate constraints, limiting their application in real-time compression scenarios. Additionally, these methods often sacrifice reconstruction fidelity, as diffusion models typically fail to guarantee pixel-level consistency. To address these challenges, we introduce StableCodec, which enables one-step diffusion for high-fidelity and high-realism extreme image compression with improved coding efficiency. To achieve ultra-low bitrates, we first develop an efficient Deep Compression Latent Codec to transmit a noisy latent representation for a single-step denoising process. We then propose a Dual-Branch Coding Structure, consisting of an auxiliary encoder and decoder pair, to enhance reconstruction fidelity. Furthermore, we adopt end-to-end optimization with joint bitrate and pixel-level constraints. Extensive experiments on the CLIC 2020, DIV2K, and Kodak datasets demonstrate that StableCodec outperforms existing methods in terms of FID, KID, and DISTS by a significant margin, even at bitrates as low as 0.005 bits per pixel, while maintaining strong fidelity. Additionally, StableCodec achieves inference speeds comparable to mainstream transform coding schemes. All source code is available at https://github.com/LuizScarlet/StableCodec.
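To give a sense of scale for the bitrates quoted in the abstract, the short sketch below converts a bits-per-pixel figure into a total bit and byte budget for one image. The helper name `bit_budget` is hypothetical and not part of the StableCodec code base; the 768x512 resolution is the standard size of the Kodak test images.

```python
def bit_budget(width: int, height: int, bpp: float) -> tuple[float, float]:
    """Return (total_bits, total_bytes) for an image coded at `bpp` bits per pixel.

    Hypothetical helper for illustration only; it is plain arithmetic and does
    not model any codec overhead such as headers.
    """
    total_bits = width * height * bpp
    return total_bits, total_bits / 8


# A Kodak image (768x512 pixels) at the lowest bitrate quoted in the abstract.
bits, size_bytes = bit_budget(768, 512, 0.005)
print(f"0.005 bpp: {bits:.0f} bits, about {size_bytes:.0f} bytes")

# The same image at the 0.05 bpp threshold the abstract uses for "ultra-low".
bits_hi, size_bytes_hi = bit_budget(768, 512, 0.05)
print(f"0.05 bpp: {bits_hi:.0f} bits, about {size_bytes_hi:.0f} bytes")
```

Under these assumptions, an entire Kodak image at 0.005 bpp must fit in roughly 246 bytes, which is why the decoder has to rely on a strong generative prior rather than on transmitted pixel detail.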
