GABIC: Graph-based Attention Block for Image Compression

Gabriele Spadaro, Alberto Presta, Enzo Tartaglione, Jhony H. Giraldo, Marco Grangetto, Attilio Fiandrotti

TL;DR

The paper addresses redundancy in the attention mechanisms used by Learned Image Compression (LIC) models and proposes the Graph-based Attention Block for Image Compression (GABIC), which constrains attention to a local $k$-NN graph within each window. GABIC updates patch embeddings via a graph convolution whose attention coefficients $\alpha_{i,j}$ are computed from projected features, which reduces redundant feature aggregation. The block is integrated into a hyperprior LIC framework with a channel-wise entropy model. On the Kodak and CLIC benchmarks, GABIC achieves BD-Rate gains of about $1.50\%$ and $0.89\%$, respectively, with the strongest improvements at high bitrates and better preservation of high-frequency details, while keeping complexity comparable to window-based attention methods. The work points to a promising direction for LIC by fusing graph attention with local window processing, and it provides code and trained models for reproducibility.
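
To make the idea of attention restricted to a $k$-NN graph inside one window more concrete, the following is a minimal NumPy sketch, not the authors' implementation. Several details are assumptions made for illustration only: the function name `knn_graph_attention`, the use of Euclidean distance between embeddings to pick neighbours, the exclusion of self-loops, the scaled dot-product form of the scores, and the projection matrices `w_q`, `w_k`, `w_v`.

```python
import numpy as np

def knn_graph_attention(x, w_q, w_k, w_v, k):
    """Attention over a k-NN graph for the N patch embeddings of one window.

    x: (N, d) embeddings; w_q, w_k, w_v: (d, d_out) projections (hypothetical).
    Requires k <= N - 1 because self-loops are excluded (an assumption).
    """
    # Pairwise squared Euclidean distances (assumed neighbour criterion).
    sq = np.sum(x ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.fill_diagonal(dist, np.inf)           # a node is not its own neighbour
    nbrs = np.argsort(dist, axis=1)[:, :k]   # (N, k) neighbour indices

    q, key, v = x @ w_q, x @ w_k, x @ w_v    # projected features

    # Scores are computed only on the k graph edges of each node.
    scores = np.einsum("nd,nkd->nk", q, key[nbrs]) / np.sqrt(q.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # stable softmax
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)     # alpha[i, j] per edge

    # Each node aggregates the values of its k neighbours only.
    out = np.einsum("nk,nkd->nd", alpha, v[nbrs])
    return out, alpha, nbrs

# Usage with random data: a window of 16 patches with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out, alpha, nbrs = knn_graph_attention(x, w_q, w_k, w_v, k=4)
```

Compared with dense window attention, where every patch attends to every other patch in the window, each row of `alpha` here has only `k` entries, which is one way to read the reduction in redundant feature aggregation described above.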

Abstract

While standardized codecs like JPEG and HEVC-intra represent the industry standard in image compression, neural Learned Image Compression (LIC) codecs represent a promising alternative. In detail, integrating attention mechanisms from Vision Transformers into LIC models has shown improved compression efficiency. However, extra efficiency often comes at the cost of aggregating redundant features. This work proposes a Graph-based Attention Block for Image Compression (GABIC), a method to reduce feature redundancy based on a k-Nearest Neighbors enhanced attention mechanism. Our experiments show that GABIC outperforms comparable methods, particularly at high bit rates, enhancing compression performance.

Paper Structure

This paper contains 15 sections, 12 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: By dividing the image into patches, their content can be compressed using a GNN. The dynamic graph structure on which attention is computed better preserves local details, enabling higher compressibility at high bit rates.
  • Figure 2: Overview of GABIC's architecture (a) and detail of the proposed Graph-based Window Attention Module (b).
  • Figure 3: Traditional local window block scheme (a) and our proposed Local Graph-based Window block (b).
  • Figure 4: Rate Distortion plots for the Kodak (a) and CLIC (b) datasets.
  • Figure 5: Original images (a, d) and bit allocation at low bitrates (b, e) and at high bitrates (c, f). Red indicates that GABIC spends more bits than Zou2022 [zou2022devil]; blue indicates that it spends fewer.
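
The BD-Rate figures quoted in the TL;DR summarize the gap between two rate-distortion curves such as those in Figure 4. As a rough illustration of how such a number is commonly obtained, the sketch below uses the classic Bjøntegaard approach: fit a cubic polynomial to log-rate as a function of PSNR for each codec, then average the difference over the shared PSNR range. This is an assumption about the metric's usual definition, not a description of the paper's evaluation script, which may use a different interpolation. The function name `bd_rate` and the sample points are hypothetical and are not taken from the paper.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference (%) of the test codec relative to the anchor.

    Negative values mean the test codec needs fewer bits at equal PSNR.
    """
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)

    # Cubic fit of log-rate as a function of PSNR for each curve.
    poly_a = np.polyfit(pa, log_ra, 3)
    poly_t = np.polyfit(pt, log_rt, 3)

    # Integrate both fits over the PSNR interval covered by both curves.
    lo, hi = max(pa.min(), pt.min()), min(pa.max(), pt.max())
    int_a, int_t = np.polyint(poly_a), np.polyint(poly_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)

    # Mean log-rate difference converted back to a percentage.
    return (np.exp(avg_t - avg_a) - 1.0) * 100.0

# Synthetic example (not from the paper): the test codec uses 10% fewer
# bits than the anchor at every PSNR point, so the result is about -10.
psnr = [30.0, 33.0, 35.0, 36.5]
anchor_bpp = [0.2, 0.4, 0.6, 0.8]
test_bpp = [0.9 * r for r in anchor_bpp]
saving = bd_rate(anchor_bpp, psnr, test_bpp, psnr)
print(round(saving, 2))
```

With this sign convention, a saving reported as a positive "gain" corresponds to a negative return value of the same magnitude.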