
PSC: Posterior Sampling-Based Compression

Noam Elata, Tomer Michaeli, Michael Elad

TL;DR

PSC tackles flexible, training-free image compression by constructing an image-specific transform ${\bm{H}}$ in an adaptive, progressive manner. It leverages a pre-trained diffusion model as the sole neural component and uses a diffusion-based posterior sampler to select transform rows, enabling zero-shot compression without transmitting ${\bm{H}}$ while encoding measurements ${\mathbf{y}} = {\bm{H}}{\mathbf{x}}$ through quantization $Q(\cdot)$. The method achieves competitive rate-distortion and perceptual quality across rates, including a latent-PSC variant that operates in the diffusion latent space conditioned on text prompts. While computationally intensive and currently using simplified quantization, PSC offers a scalable framework that improves in tandem with advances in diffusion/posterior sampling.

Abstract

Diffusion models have transformed the landscape of image generation and now show remarkable potential for image compression. Most recent diffusion-based compression methods require training and are tailored to a specific bit-rate. In this work, we propose Posterior Sampling-based Compression (PSC) - a zero-shot compression method that leverages a pre-trained diffusion model as its sole neural network component, thus enabling the use of diverse, publicly available models without additional training. Our approach is inspired by transform coding methods, which encode the image in some pre-chosen transform domain. However, PSC constructs a transform that is adaptive to the image. This is done by employing a zero-shot diffusion-based posterior sampler to progressively construct the rows of the transform matrix. Each new chunk of rows is chosen to reduce the uncertainty about the image given the quantized measurements collected thus far. Importantly, the same adaptive scheme can be replicated at the decoder, thus avoiding the need to encode the transform itself. We demonstrate that even with basic quantization and entropy coding, PSC's performance is comparable to established training-based methods in terms of rate, distortion, and perceptual quality. It does so while providing greater flexibility: any desired rate or distortion can be chosen at inference time.
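The progressive scheme described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the diffusion-based posterior sampler is replaced by an exact Gaussian posterior under an assumed prior `prior_cov`, quantization error is modeled as noise with variance `step**2 / 12`, and new rows are taken as the top principal directions of the posterior samples (one plausible reading of "reduce the uncertainty"). All function and parameter names (`quantize`, `sample_posterior`, `run_psc`, `received`) are hypothetical.

```python
import numpy as np

def quantize(v, step):
    """Uniform scalar quantizer: a simple stand-in for Q(.)."""
    return step * np.round(v / step)

def sample_posterior(H, y, prior_cov, step, n_samples, rng):
    """Toy stand-in for the diffusion posterior sampler (assumption).

    Draws exact posterior samples of x ~ N(0, prior_cov) given y = Hx + e,
    where the quantization error e is modeled as N(0, step^2 / 12 * I).
    """
    d = prior_cov.shape[0]
    if H.shape[0] == 0:
        mean, cov = np.zeros(d), prior_cov
    else:
        noise_cov = (step ** 2 / 12.0) * np.eye(H.shape[0])
        S = H @ prior_cov @ H.T + noise_cov
        K = np.linalg.solve(S, H @ prior_cov).T  # K = prior_cov H^T S^{-1}
        mean = K @ y
        cov = prior_cov - K @ H @ prior_cov
    cov = 0.5 * (cov + cov.T)  # symmetrize against round-off
    return rng.multivariate_normal(mean, cov, size=n_samples, check_valid="ignore")

def run_psc(prior_cov, step, n_steps, rows_per_step, seed,
            x=None, received=None, n_samples=64):
    """Run the adaptive loop as encoder (pass x) or decoder (pass received)."""
    rng = np.random.default_rng(seed)  # shared seed keeps both sides in sync
    d = prior_cov.shape[0]
    H, y = np.zeros((0, d)), np.zeros(0)
    for _ in range(n_steps):
        samples = sample_posterior(H, y, prior_cov, step, n_samples, rng)
        centered = samples - samples.mean(axis=0)
        # Rows of vt are principal directions of the sample covariance,
        # ordered from largest to smallest posterior uncertainty.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        new_rows = vt[:rows_per_step]
        if x is not None:   # encoder: measure the image and quantize
            new_y = quantize(new_rows @ x, step)
        else:               # decoder: read the same values from the stream
            new_y = received[len(y):len(y) + rows_per_step]
        H = np.vstack([H, new_rows])
        y = np.concatenate([y, new_y])
    x_hat = sample_posterior(H, y, prior_cov, step, n_samples, rng).mean(axis=0)
    return H, y, x_hat

# Usage: the decoder rebuilds H from the quantized measurements alone.
demo_rng = np.random.default_rng(0)
A = demo_rng.standard_normal((8, 8))
prior = A @ A.T / 8 + 0.1 * np.eye(8)
x_true = demo_rng.multivariate_normal(np.zeros(8), prior)
H_enc, y_sent, _ = run_psc(prior, 0.05, 4, 2, seed=123, x=x_true)
H_dec, _, x_rec = run_psc(prior, 0.05, 4, 2, seed=123, received=y_sent)
print("transforms match:", np.allclose(H_enc, H_dec))
print("reconstruction error:", np.linalg.norm(x_true - x_rec))
```

Because both sides see the same quantized measurements and draw from the same seeded generator, the decoder never needs `H` as side information; only `y_sent` is transmitted. Stopping the loop after fewer steps gives a lower rate, which is how a single scheme can serve any target rate in this sketch.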

Paper Structure (16 sections, 12 figures, 3 algorithms)

Figures (12)

  • Figure 1: Images compressed with latent-PSC at low bit-rates. Latent-PSC leverages pre-trained diffusion models to deliver high perceptual quality at any compression rate, indicated by the bits-per-pixel (BPP) for each decompressed result. Top: An example at ultra-low rates shows that PSC maintains high image quality while preserving the original image composition. Bottom: An example at low rates, with zoomed-in detail, highlights PSC's capacity to retain fine details while adapting to any compression rate without training.
  • Figure 2: A diagram of how PSC zeros in on the input image by progressively tightening the posterior. The diagram shows a single step, where the new rows and matching measurements are computed based on the direction of largest uncertainty in the posterior distribution. The posterior mean is shown to visualize the information captured by previous iterations.
  • Figure 3: PSC diagram: Both encoder and decoder construct an image-specific transform ${\bm{H}}$ through an adaptive compressed sensing algorithm, progressively adding rows based on posterior sample covariance. The transmission of quantized measurements ${\mathbf{y}}$ ensures identical inputs at each progressive step, while a shared random seed guarantees deterministic outputs on both sides. Together, these factors enable the construction of identical transforms on both sides, eliminating the need to transmit the transform as side information.
  • Figure 4: Qualitative examples of compression with PSC, compared to other compression algorithms. BPP and PSNR are reported per example. Our method can be used for either low distortion or high perceptual quality using the same compressed representation.
  • Figure 5: Rate-Distortion (left) and Rate-Perception (right) curves for ImageNet256 compression. Distortion is measured as average PSNR of images for the same desired rate or specified compression quality, while Perception (photorealism) is measured by FID.
  • ...and 7 more figures
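
The captions above report rate in bits per pixel (BPP) and distortion as PSNR. As a reminder of how these two quantities are usually computed, here is a minimal sketch using the standard definitions; the function names and the 8-bit peak value of 255 are assumptions, and the paper's exact evaluation code is not shown in this text.

```python
import numpy as np

def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstruction, dtype=np.float64)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

def bits_per_pixel(num_bits, height, width):
    """Rate of a compressed image: total coded bits divided by pixel count."""
    return num_bits / (height * width)

# Usage on a synthetic 256x256 image with a constant error of one gray level.
original = np.zeros((256, 256))
decoded = original + 1.0
print("PSNR (dB):", psnr(original, decoded))
print("BPP:", bits_per_pixel(num_bits=8192, height=256, width=256))
```

Higher PSNR means lower distortion, while FID, used for the perception curves, is a separate distribution-level metric that this sketch does not cover.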