
Adaptive Learned Image Compression with Graph Neural Networks

Yunuo Chen, Bing He, Zezheng Lyu, Hongwei Hu, Qunshan Gu, Yuan Tian, Guo Lu

Abstract

Efficient image compression relies on modeling both local and global redundancy. Most state-of-the-art (SOTA) learned image compression (LIC) methods are based on CNNs or Transformers, which are inherently rigid. Standard CNN kernels and window-based attention mechanisms impose fixed receptive fields and static connectivity patterns, which potentially couple non-redundant pixels simply due to their proximity in Euclidean space. This rigidity limits the model's ability to adaptively capture spatially varying redundancy across the image, particularly at the global level. To overcome these limitations, we propose a content-adaptive image compression framework based on Graph Neural Networks (GNNs). Specifically, our approach constructs dual-scale graphs that enable flexible, data-driven receptive fields. Furthermore, we introduce adaptive connectivity by dynamically adjusting the number of neighbors for each node based on local content complexity. These innovations empower our Graph-based Learned Image Compression (GLIC) model to effectively model diverse redundancy patterns across images, leading to more efficient and adaptive compression. Experiments demonstrate that GLIC achieves state-of-the-art performance, with BD-rate reductions of 19.29%, 21.69%, and 18.71% relative to VTM-9.1 on Kodak, Tecnick, and CLIC, respectively. Code will be released at https://github.com/UnoC-727/GLIC.
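The adaptive-connectivity idea in the abstract can be illustrated with a minimal sketch: build a kNN graph over node features where each node's neighbor count scales with a content-complexity proxy. This is an illustration only, not the paper's implementation; `k_min`, `k_max`, and the variance-based complexity score are assumptions introduced here.

```python
import numpy as np

def adaptive_knn_graph(features, k_min=2, k_max=6):
    """Build a directed kNN graph where each node's neighbor count
    scales with a simple content-complexity proxy (feature variance).
    features: (N, D) array of node features.
    Returns (edges, ks): edge list of (src, dst) pairs and per-node k.
    NOTE: illustrative sketch, not the GLIC paper's exact formulation."""
    n = features.shape[0]

    # Pairwise Euclidean distances between node features.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    np.fill_diagonal(dist, np.inf)  # exclude self-loops

    # Complexity proxy: per-node feature variance, normalized to [0, 1].
    var = features.var(axis=1)
    denom = var.max() - var.min()
    score = (var - var.min()) / denom if denom > 0 else np.zeros(n)

    # More "complex" nodes are assigned more neighbors.
    ks = (k_min + np.round(score * (k_max - k_min))).astype(int)

    edges = []
    for i in range(n):
        nbrs = np.argsort(dist[i])[: ks[i]]  # ks[i] nearest neighbors
        edges.extend((i, int(j)) for j in nbrs)
    return edges, ks
```

In a real codec the graph would be built per layer on learned feature maps (the paper's dual-scale graphs), with the complexity measure itself learned rather than a fixed variance heuristic.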

Paper Structure

This paper contains 25 sections, 12 equations, 8 figures, 4 tables, 1 algorithm.

Figures (8)

  • Figure 1: (a) While standard convolution and window attention-based methods are constrained by a rigid local window and a fixed connectivity pattern, our GNN-based interactions enable local and global adaptive connections across the image. (b) Rate-Distortion performance and efficiency comparisons on the Tecnick dataset. Upper left is better.
  • Figure 2: Overview of our method. (a) Architecture of the proposed GLIC codec. Channel widths are $C_1,C_2,C_3,C_4$, and the numbers of non-linear transform blocks are $L_1,L_2,L_3$. (b) Graph-based Feature Aggregation Block used as advanced non-linear transforms. (c) Lightweight Conv Block for early feature extraction and reduced complexity.
  • Figure 3: PSNR R-D curves on the CLIC 2020 dataset.
  • Figure 4: PSNR R-D curves on the Tecnick dataset.
  • Figure 5: PSNR and MS-SSIM R-D curves on the Kodak Dataset.
  • ...and 3 more figures