Efficient Progressive Image Compression with Variance-aware Masking

Alberto Presta, Enzo Tartaglione, Attilio Fiandrotti, Marco Grangetto, Pamela Cosman

TL;DR

The paper addresses scalable, progressive image compression with a two-latent scheme: a base latent $y^{b}$ for the lowest quality and a top latent $y^{t}$ for higher qualities, with the residual $r^{t}=y^{t}-y^{b}$ used to progressively refine reconstructions. It proposes a lightweight variance-aware masking policy that adds no parameters and partitions $r^{t}$ into complementary components for transmission at different quality levels, along with a Progressive Channel-wise Entropy Estimation Module (PCEEM) and Rate Enhancement Modules (REMs) that improve entropy estimation. Key contributions include the base/top latent framework, the nonparametric masking strategy, the PCEEM architecture, and REMs that refine entropy parameters at quality checkpoints, yielding competitive rate-distortion performance while reducing decoding time and parameter count. The approach enables efficient, scalable progressive decoding suited to networks with fluctuating capacity and real-time constraints, with practical impact on streaming and adaptive image compression pipelines. $y^{b}$, $y^{t}$, and $r^{t}$ are the core constructs behind the quality-guided bitstream, while the quality level $q$, ranging over $[0,100]$, governs the progressive reconstruction, all integrated through hyperprior modeling and channel-wise entropy estimation.

Abstract

Learned progressive image compression is gaining momentum because it improves image reconstruction as more bits are decoded at the receiver. We propose a progressive image compression method in which an image is first represented as a pair of base-quality and top-quality latent representations. Next, a residual latent representation is encoded as the element-wise difference between the top and base representations. Our scheme enables progressive image compression with element-wise granularity by introducing a masking system that ranks each element of the residual latent representation from most to least important and divides it into complementary components, which can be transmitted separately to the decoder to obtain different reconstruction qualities. The masking system adds neither parameters nor complexity. At the receiver, any elements of the top latent representation excluded from the transmitted components can be independently replaced with the mean predicted by the hyperprior architecture, ensuring reliable reconstructions at any intermediate quality level. We also introduce Rate Enhancement Modules (REMs), which refine the estimation of entropy parameters using already decoded components. We obtain results competitive with state-of-the-art methods, while significantly reducing computational complexity, decoding time, and number of parameters.
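
The masking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions made here for illustration: latents are flattened to plain lists, the importance score is the standard deviation predicted by the entropy model (suggested by the name "variance-aware", but not spelled out in the abstract), and $q$ is read as the percentage of residual elements transmitted. The function names `variance_mask` and `progressive_latent` are hypothetical.

```python
from typing import List, Sequence


def variance_mask(std: Sequence[float], q: float) -> List[bool]:
    """Keep-mask selecting the q percent of elements with the largest std.

    Assumption: importance is measured by the predicted standard deviation.
    """
    if not 0.0 <= q <= 100.0:
        raise ValueError("q must lie in [0, 100]")
    n = len(std)
    n_keep = round(n * q / 100.0)
    # Rank indices from most to least important; sorted() is stable, so
    # ties keep their original order and masks stay nested as q grows.
    order = sorted(range(n), key=lambda i: -std[i])
    mask = [False] * n
    for i in order[:n_keep]:
        mask[i] = True
    return mask


def progressive_latent(
    y_base: Sequence[float],
    residual: Sequence[float],
    mean: Sequence[float],
    std: Sequence[float],
    q: float,
) -> List[float]:
    """Approximate the top latent at quality q.

    Transmitted elements are rebuilt as base + residual; every other
    element falls back to the (hypothetical) hyperprior mean.
    """
    mask = variance_mask(std, q)
    return [
        b + r if keep else m
        for b, r, m, keep in zip(y_base, residual, mean, mask)
    ]


# Toy values, chosen only to make the behaviour visible.
y_base = [1.0, 1.0, 1.0, 1.0]
residual = [0.5, -0.25, 0.75, 0.25]
mean = [9.0, 9.0, 9.0, 9.0]
std = [0.5, 2.0, 0.1, 1.0]

print(progressive_latent(y_base, residual, mean, std, q=50))
# prints [9.0, 0.75, 9.0, 1.25]
```

Because the ranking does not depend on $q$, the mask for a lower quality is always a subset of the mask for a higher one. This nesting is what allows the residual to be split into complementary components that are sent one after another.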

Paper Structure

This paper contains 17 sections, 6 equations, 19 figures, 3 tables, 1 algorithm.

Figures (19)

  • Figure 1: Compression results for three different qualities, which increase across rows. Adding details via the masking system (a) increases the standard deviation in the non-masked latent representation (b) to add details (c) for a better reconstruction (d).
  • Figure 2: Overview of our proposed architecture. Green and red boxes represent encoder and decoder modules, respectively, while blue boxes must be stored at both encoder and decoder. $q$ represents the target quality.
  • Figure 3: Progressive channel-wise entropy estimation model (PCEEM) during the $i$-th slice, considering a general quality $q$. $\mathbf{\hat{y}}^{b}$ represents the base latent representation already obtained. $||$ represents concatenation along channels, while $\bigodot$ represents an element-wise operation, which can be summation (+) or subtraction (-).
  • Figure 4: Blueprint of REM for a fixed checkpoint quality $\bar{q}$ and for the slice $i$.
  • Figure 5: Rate-distortion performance of our method compared with progressive image compression algorithms: Jeon [jeon2023context], Lee [lee2022dpict], Lu [lu2021progressive], and JPEG2000 [jpeg2k]. We tested our method on the Kodak (left), JPEG-AI (center), and CLIC validation (right) datasets.
  • ...and 14 more figures