TreeNet: A Light Weight Model for Low Bitrate Image Compression

Mahadev Prasad Panda, Purnachandra Rao Makkena, Srivatsa Prativadibhayankaram, Siegfried Fößel, André Kaup

TL;DR

TreeNet addresses the high computational cost of learned image compression with a binary-tree encoder–decoder that uses attentional feature fusion to combine branches. The architecture yields four latent streams, each processed by its own entropy bottleneck, together with a constrained, low-channel-count decoder, so the branches can be computed in parallel. Experiments on Kodak, CLIC, and Tecnick show that, at low bitrates, TreeNet improves BD-rate over JPEG AI by 4.83% on average while reducing model complexity by 87.82%. Ablation studies examine how each latent contributes to reconstruction and how rate is distributed spatially. Overall, TreeNet offers a practical, interpretable, low-cost path toward scalable learning-based image compression.

Abstract

Reducing computational complexity remains a critical challenge for the widespread adoption of learning-based image compression techniques. In this work, we propose TreeNet, a novel low-complexity image compression model that leverages a binary tree-structured encoder-decoder architecture to achieve efficient representation and reconstruction. We employ an attentional feature fusion mechanism to effectively integrate features from multiple branches. We evaluate TreeNet on three widely used benchmark datasets and compare its performance against competing methods, including JPEG AI, a recent standard in learning-based image compression. At low bitrates, TreeNet achieves an average improvement of 4.83% in BD-rate over JPEG AI, while reducing model complexity by 87.82%. Furthermore, we conduct extensive ablation studies to investigate the influence of various latent representations within TreeNet, offering deeper insights into the factors contributing to reconstruction.
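
To put the reported complexity figure in perspective, the short calculation below converts a percentage reduction into the fraction of the reference complexity that remains and into an equivalent reduction factor. It is only an arithmetic illustration: the helper names `retained_fraction` and `reduction_factor` are hypothetical, and the calculation assumes the 87.82% is a relative reduction with respect to JPEG AI in whatever complexity measure the paper uses.

```python
def retained_fraction(reduction_percent):
    """Fraction of the reference complexity left after a relative reduction."""
    return 1.0 - reduction_percent / 100.0


def reduction_factor(reduction_percent):
    """How many times smaller the reduced complexity is than the reference."""
    return 1.0 / retained_fraction(reduction_percent)


# Value taken from the abstract: 87.82% lower complexity than JPEG AI.
print(round(retained_fraction(87.82), 4))  # 0.1218
print(round(reduction_factor(87.82), 2))   # 8.21
```

Under that assumption, TreeNet would need roughly 12% of the reference complexity, or about one eighth.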

Paper Structure

This paper contains 13 sections, 7 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Top: BD-rate for PSNR on the Tecnick dataset [asuni2014testimages]. TreeNet achieves a better trade-off between coding performance and complexity. Bottom: Comparison of decoded patches on the kodim06 image from the Kodak dataset [kodak1993].
  • Figure 2: Schematic diagram of TreeNet. A binary tree-based analysis transform $g_a$ maps image $x$ into four latents $y_1, y_2, y_3,$ and $y_4$. These latents are independently quantized and entropy coded using four entropy bottleneck blocks $g_{eb}^1, g_{eb}^2, g_{eb}^3$, and $g_{eb}^4$. The synthesis transform $g_s$ maps the latents back to image space, producing the reconstructed image $\hat{x}$. The attentional feature fusion is as described in [dai21aff].
  • Figure 3: RD curves on the Kodak dataset [kodak1993].
  • Figure 4: RD curves on the CLIC Professional Valid dataset [toderici2020workshop].
  • Figure 5: RD curves on the Tecnick dataset [asuni2014testimages].
  • ...and 3 more figures
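
The data flow in the Figure 2 caption (one input, a binary tree of depth two, four independently coded latents) can be illustrated with a toy sketch. This is not the authors' model: `split`, `tree_analysis`, `quantize`, and `empirical_bits` are hypothetical names, the even/odd split stands in for the learned branches of $g_a$, and the empirical-entropy estimate stands in for the learned entropy bottlenecks $g_{eb}^1, \dots, g_{eb}^4$. The only property it is meant to show is that four independently coded streams have a total rate equal to the sum of the per-stream rates.

```python
import math
from collections import Counter


def split(features):
    """Hypothetical branch op: one node emits two children (even/odd samples).
    In a real model each child would be produced by learned layers."""
    return features[0::2], features[1::2]


def tree_analysis(x, depth=2):
    """Binary tree of the given depth: depth 2 gives 1 -> 2 -> 4 leaf latents."""
    nodes = [x]
    for _ in range(depth):
        nodes = [child for node in nodes for child in split(node)]
    return nodes


def quantize(latent):
    """Round each value to the nearest integer symbol."""
    return [round(v) for v in latent]


def empirical_bits(symbols):
    """Ideal code length of one stream under its own empirical distribution
    (a stand-in for a learned entropy model)."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in counts.values())


x = list(range(16))                       # toy 1-D "image"
latents = [quantize(y) for y in tree_analysis(x)]
per_stream = [empirical_bits(y) for y in latents]

print(len(latents))      # 4 latents: y_1 .. y_4
print(sum(per_stream))   # 32.0 bits in total (8.0 per stream)
```

Because each stream is coded on its own, the four rate terms can be computed independently, which is the kind of structure that makes the parallel processing mentioned in the summary possible.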