ProgDTD: Progressive Learned Image Compression with Double-Tail-Drop Training

Ali Hojjat, Janek Haberer, Olaf Landsiedel

TL;DR

ProgDTD addresses the lack of progression in learned image compression by introducing a training-based method that sorts bottleneck information by importance. It leverages double-tail-drop to induce a progressively decodable representation without adding model parameters, applied here to the Ballé hyperprior framework. Results show ProgDTD achieves MS-SSIM and accuracy comparable to non-progressive baselines and competitive progressive models, with controllable progression via a user-specified range. This approach enables practical progressive decoding for CNN-based learned compression, offering a flexible, parameter-free means to adapt bitrate and decoding latency to network conditions.

Abstract

Progressive compression allows images to start loading as low-resolution versions, becoming clearer as more data is received. This improves the user experience when, for example, network connections are slow. Today, most approaches to image compression, both classical and learned, are designed to be non-progressive. This paper introduces ProgDTD, a training method that transforms learned, non-progressive image compression approaches into progressive ones. The design of ProgDTD is based on the observation that the information stored within the bottleneck of a compression model commonly varies in importance. To create a progressive compression model, ProgDTD modifies the training steps so that the model stores the data in the bottleneck sorted by priority. We achieve progressive compression by transmitting the data in order of its sorted index. ProgDTD is designed for CNN-based learned image compression models, does not need additional parameters, and has a customizable range of progressiveness. For evaluation, we apply ProgDTD to the hyperprior model, one of the most common structures in learned image compression. Our experimental results show that ProgDTD performs comparably to its non-progressive counterparts and other state-of-the-art progressive models in terms of MS-SSIM and accuracy.
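The idea of forcing the bottleneck to store data sorted by priority can be illustrated with a minimal sketch of a single tail-drop step. The sketch is not the authors' implementation: the function names `tail_drop` and `sample_keep_fraction`, the use of `ceil` to round the number of kept channels, and the toy bottleneck represented as a list of flat channels are all assumptions made for illustration. During training, a random fraction is drawn and every channel after that cut-off is zeroed, so the model is pushed to put the most important information in the earliest channels.

```python
import math
import random


def tail_drop(channels, keep_fraction):
    """Keep the first ceil(keep_fraction * N) channels and zero the rest.

    `channels` is a list of N channels, each a flat list of floats.
    The ceil rounding is an assumption of this sketch.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must lie in [0, 1]")
    n_keep = math.ceil(keep_fraction * len(channels))
    return [
        list(ch) if i < n_keep else [0.0] * len(ch)
        for i, ch in enumerate(channels)
    ]


def sample_keep_fraction(rng, low=0.0, high=1.0):
    """Draw the keep fraction from U(low, high), e.g. U(0, 1) or U(0.3, 1)."""
    return rng.uniform(low, high)


# Toy bottleneck with 4 channels of 2 values each (hypothetical data).
latent = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

rng = random.Random(0)
k = sample_keep_fraction(rng, low=0.3, high=1.0)
dropped = tail_drop(latent, k)   # what the decoder would see in this step
half = tail_drop(latent, 0.5)    # first two channels kept, last two zeroed
```

Raising the lower bound of the sampling range (for example from 0 to 0.3) means the earliest channels are never dropped during training, which is one plausible reading of the "customizable range of progressiveness" mentioned above.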

Paper Structure (21 sections, 9 equations, 6 figures, 1 algorithm)

Figures (6)

  • Figure 1: Qualitative comparison of reconstructed images from: ProgDTD (trained with $\lambda=0.01$), non-progressive-Ballé (trained with $\lambda=0.0001, 0.001,0.005, 0.01$) and standard-Ballé (trained with $\lambda=0.01$). ProgDTD and non-progressive-Ballé reconstruct images of similar quality; meanwhile, removing only a few bits in standard-Ballé leads to a significant degradation in quality.
  • Figure 2: Tail-Drop Training: An example of the information distribution in the bottleneck $\mathcal{B}[8,M,M]$. When the model is trained with tail-drop, the training procedure orders the data by importance. The colors show the importance of each filter.
  • Figure 3: The network architecture of the hyperprior model (Ballé et al., 2018) with double-tail-drop. Quantization is represented by Q, and arithmetic encoding and decoding are denoted by AE and AD, respectively. The bitrate controller determines the target bitrate.
  • Figure 4: RD performance (MS-SSIM and PSNR) comparison of the proposed ProgDTD (trained with $\lambda=0.01, 0.05$ and also with $\mathcal{U}(0, 1)$ and $\mathcal{U}(0.3, 1)$ as the random number generator), non-progressive-Ballé (trained with $\lambda=0.0001, 0.001, 0.005, 0.01, 0.05$), standard-Ballé (trained with $\lambda=0.01, 0.05$), DPICT (Lee et al., 2022), JPEG2000 (Skodras et al., 2001), Johnston et al. (2018), and Toderici et al. (2015) on the KODAK dataset. Note that non-progressive-Ballé (dashed line) is a collection of Ballé models trained with different $\lambda$ values; it is not progressive.
  • Figure 5: RD performance (MS-SSIM and PSNR) comparison of the proposed ProgDTD (trained with $\lambda=0.01, 0.05, 0.1, 1.0$ and also with $\mathcal{U}(0, 1)$ and $\mathcal{U}(0.3, 1)$ as the random number generator) on the KODAK dataset.
  • ...and 1 more figure
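The progressive decoding path implied by the abstract (channels transmitted in order of their sorted index) and by the bitrate controller in Figure 3 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's code: `transmit_prefix` and `zero_fill` are hypothetical helpers, the bottleneck is modeled as a list of flat channels, and entropy coding (AE/AD) is omitted entirely. The sender emits only a prefix of the channels, and the receiver pads the missing tail with zeros before decoding, mirroring what the decoder saw during tail-drop training.

```python
def transmit_prefix(channels, n_sent):
    """Sender side: emit only the first n_sent channels, in index order."""
    if not 0 <= n_sent <= len(channels):
        raise ValueError("n_sent must lie in [0, number of channels]")
    return [list(ch) for ch in channels[:n_sent]]


def zero_fill(received, total_channels, channel_len):
    """Receiver side: pad the channels that have not arrived with zeros."""
    padded = [list(ch) for ch in received]
    missing = total_channels - len(received)
    padded.extend([0.0] * channel_len for _ in range(missing))
    return padded


# Toy bottleneck with 4 channels of 2 values each (hypothetical data).
latent = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

# A hypothetical bitrate controller decides that only 3 of 4 channels fit.
sent = transmit_prefix(latent, n_sent=3)
decoder_input = zero_fill(sent, total_channels=4, channel_len=2)
```

As more channels arrive, the receiver can call `zero_fill` again with the longer prefix and re-decode, which gives the gradually sharpening image that progressive compression aims for.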