Unicorn: Unified Neural Image Compression with One Number Reconstruction

Qi Zheng, Haozhi Wang, Zihao Liu, Jiaming Liu, Peiye Liu, Zhijian Hao, Yanheng Lu, Dimin Niu, Jinjia Zhou, Minge Jing, Yibo Fan

TL;DR

Unicorn reframes image compression by treating a set of images as index-image pairs and training a single unified decoder to reconstruct images from random noise conditioned on an index. The Laduree prototype, based on a conditional latent diffusion model, achieves significant bitrate savings over both explicit and implicit schemes and exhibits increasing compression gains as the number of images grows, owing to the elimination of inter-image redundancy. The work demonstrates a viable pathway toward scalable large-scale image storage with a lightweight online cost and explores design choices in conditioning, latent normalization, and weight quantization to push practical performance further.

Abstract

Prevalent lossy image compression schemes can be divided into 1) explicit image compression (EIC), including traditional standards and neural end-to-end algorithms, and 2) implicit image compression (IIC) based on implicit neural representations (INR). The former has reached an impasse in which bitrate reduction is leveling off at the cost of tremendous complexity, while the latter suffers from excessively smooth reconstructions as well as lengthy decoder models. In this paper, we propose an innovative paradigm, which we dub Unicorn (Unified Neural Image Compression with One Number Reconstruction). By conceptualizing the images as index-image pairs and learning the inherent distribution of pairs in a subtle neural network model, Unicorn can reconstruct a visually pleasing image from randomly generated noise with only one index number. The neural model serves as the unified decoder of images, while the noises and indexes correspond to explicit representations. As a proof of concept, we propose an effective and efficient prototype of Unicorn based on latent diffusion models with tailored model designs. Quantitative and qualitative experimental results demonstrate that our prototype achieves significant bitrate reduction compared with EIC and IIC algorithms. More impressively, benefiting from the unified decoder, our compression ratio escalates as the quantity of images increases. We envision that more advanced model designs will endow Unicorn with greater potential in image compression. We will release our code at https://github.com/uniqzheng/Unicorn-Laduree.
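The claim that the compression ratio grows with the number of images can be illustrated with simple amortization arithmetic. The sketch below is not the paper's bitrate accounting; it assumes that the stored payload is one shared decoder plus, per image, an index and optionally a noise seed. The function name `amortized_bpp` and all numbers in the usage lines (a 50 MB decoder, 512x512 images, no stored seed) are hypothetical and chosen only for illustration.

```python
def amortized_bpp(num_images, model_bits, height, width, seed_bits=0):
    """Bits per pixel when one shared decoder serves `num_images` images.

    Assumption (not from the paper): total storage is the decoder weights
    plus, for each image, a fixed-length index and an optional noise seed.
    """
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    # Fixed-length index: enough bits to address every image, at least 1.
    index_bits = max(1, (num_images - 1).bit_length())
    total_bits = model_bits + num_images * (index_bits + seed_bits)
    return total_bits / (num_images * height * width)


# Hypothetical setting: a 50 MB decoder and 512x512 images.
MODEL_BITS = 8 * 50_000_000
for n in (1000, 4000, 16000):
    print(n, round(amortized_bpp(n, MODEL_BITS, 512, 512), 4))
```

Under these assumptions the decoder cost is divided across all images while the per-image cost grows only logarithmically with the set size, so the bits per pixel fall as more images share the same decoder.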

Paper Structure

This paper contains 32 sections, 4 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: Bitrate comparison among EIC, IIC, and Unicorn when compressing $4000$ images at high perceptual quality ($\text{LPIPS} = 0.10$ for EIC and Unicorn, but $0.35$ for IIC, since it is hard for IIC to approach satisfactory perceptual quality).
  • Figure 2: Overall framework of the proposed paradigm Unicorn specified by the proposed prototype Laduree.
  • Figure 3: Design space exploration of various ways to embed the index and condition on it within the Transformer-based denoising model, with the parameters introduced by each approach compared at the bottom right.
  • Figure 4: RD curves of evaluated image compression models on CAT (top row) and HYBRID (bottom row) when compressing $4000$ images.
  • Figure 5: RD performance comparison in terms of PSNR.
  • ...and 4 more figures