DeepHQ: Learned Hierarchical Quantizer for Progressive Deep Image Coding

Jooyoung Lee, Se Yoon Jeong, Munchurl Kim

TL;DR

Progressive image coding aims to support multiple quality levels within a single bitstream, but prior learned PIC methods rely on handcrafted hierarchies or multiple subnetworks. DeepHQ introduces an eight-layer learned hierarchical quantizer with per-layer step sizes and a selective encoding mechanism, enabling fine-grained progressive reconstruction while reducing model size and decoding time. By integrating boundary-aware hierarchical quantization, PMF-based entropy coding, and component-wise selective coding, it achieves substantial rate savings and faster decoding compared with state-of-the-art PIC methods, using a single base model across bitrates. The approach demonstrates practical impact for scalable image delivery and opens avenues for jointly training the full hierarchical quantization framework.

Abstract

Unlike fixed- or variable-rate image coding, progressive image coding (PIC) aims to compress various qualities of images into a single bitstream, increasing the versatility of bitstream utilization and providing high compression efficiency compared to simulcast compression. Research on neural network (NN)-based PIC is in its early stages, mainly focusing on applying varying quantization step sizes to the transformed latent representations in a hierarchical manner. These approaches are designed to compress only the progressively added information as the quality improves, considering that a wider quantization interval for lower-quality compression includes multiple narrower sub-intervals for higher-quality compression. However, the existing methods are based on handcrafted quantization hierarchies, resulting in sub-optimal compression efficiency. In this paper, we propose an NN-based progressive coding method that, for the first time, utilizes quantization step sizes learned for each quantization layer. We also incorporate selective compression, with which only the essential representation components are compressed for each quantization layer. We demonstrate that our method achieves significantly higher coding efficiency than the existing approaches, with decreased decoding time and reduced model size. The source code is publicly available at https://github.com/JooyoungLeeETRI/DeepHQ
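
The nesting idea described in the abstract (a wide interval at a low-quality layer contains several narrower sub-intervals at the next layer) can be illustrated with a minimal Python sketch. It is not the DeepHQ implementation: it assumes a handcrafted hierarchy in which each step size is exactly one third of the previous one, whereas DeepHQ learns the step sizes per layer. The function names `quantize`, `encode_progressive`, and `decode_progressive` are hypothetical and introduced only for this illustration.

```python
import math

def quantize(y, step):
    """Uniform quantizer: index of the zero-centered interval of width `step` containing y."""
    return math.floor(y / step + 0.5)

def encode_progressive(y, steps):
    """Return one symbol per layer.

    The first symbol is the coarse index. Each later symbol only says which
    sub-interval of the previous interval contains y (assumption: every step
    is the previous step divided by an odd integer, here 3, so that the
    finer intervals nest exactly inside the coarser ones).
    """
    symbols, prev_index, prev_step = [], None, None
    for step in steps:
        index = quantize(y, step)
        if prev_index is None:
            symbols.append(index)
        else:
            ratio = round(prev_step / step)
            symbols.append(index - ratio * prev_index)
        prev_index, prev_step = index, step
    return symbols

def decode_progressive(symbols, steps):
    """Return the reconstruction available after each layer has been received."""
    recons, index, prev_step = [], 0, None
    for symbol, step in zip(symbols, steps):
        if prev_step is None:
            index = symbol
        else:
            ratio = round(prev_step / step)
            index = ratio * index + symbol
        recons.append(index * step)
        prev_step = step
    return recons

# Handcrafted example hierarchy (an assumption, not the learned DeepHQ values).
steps = [9.0, 3.0, 1.0]
symbols = encode_progressive(7.3, steps)
recons = decode_progressive(symbols, steps)
print(symbols)  # [1, -1, 1]
print(recons)   # [9.0, 6.0, 7.0]
```

In this toy setting every refinement symbol takes one of only three values, which is why coding just the added information is cheaper than re-coding the full index at each quality level.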

Paper Structure (16 sections, 17 equations, 20 figures, 1 table)

Figures (20)

  • Figure 1: Illustrations of (a) fixed-rate image coding models, (b) a variable-rate image coding model, and (c) a progressive image coding (PIC) model.
  • Figure 1: Model sizes, average rate savings against the BPG codec, and average decoding times of various models.
  • Figure 2: Illustrations of (a) a recurrent residual PIC scheme in the pixel domain and (b) a hierarchical quantization-based PIC scheme in latent space (feature domain).
  • Figure 3: Illustration of the existing handcrafted hierarchical quantization process.
  • Figure 4: (a) Overall encoding procedure of DeepHQ. Only essential representation components of a representation $\bm y^*$ are selected for each quantization layer, and then the selected representation components are hierarchically quantized and entropy-coded utilizing quantization step sizes learned for each quantization layer. (b) Overall decoding procedure of DeepHQ. The hierarchical dequantization process and the reshaping of restored representation components are conducted in response to the operation of the encoder. The detailed operation flowcharts of the two key elements, $Q$ and $DQ$, highlighted with bold boxes, are provided in Fig. \ref{fig:q_and_dq}. Encoder, decoder, hyper-encoder, and hyper-decoder networks are denoted as $En$, $De$, $HE$, and $HD$, respectively. Representation selection mask $m(\bm{\hat{z}}, l)$ in Eq. \ref{eq:selection_process} is abbreviated as $m_l$. Note that the compression and decompression processes for hyperprior representation $\bm {\hat{z}}$ are omitted for brevity, for which we adopt the Hyperprior model (Balle18).
  • ...and 15 more figures