Adjustable Spatio-Spectral Hyperspectral Image Compression Network

Martin Hermann Paul Fuchs, Behnood Rasti, Begüm Demir

TL;DR

This work tackles the challenge of efficiently compressing hyperspectral images (HSIs) by enabling an adjustable balance between spectral and spatial compression. It introduces HyCASS, a six-module network consisting of a spectral encoder, a configurable spatial encoder, a compression ratio (CR) adapter encoder and decoder, and the corresponding spatial and spectral decoders, combining CNNs and Residual Swin Transformer Blocks to capture both short-range and long-range redundancies. The authors report experiments on HySpecNet-11k, Berlin-Urban-Gradient, and MLRetSet, including ablations and comparisons against traditional and learning-based baselines, and derive practical guidelines for selecting spectral versus spatial emphasis as a function of CR and spatial resolution. The results indicate that spectral compression is advantageous at low CR or low spatial resolution, while spatio-spectral compression with adjustable spatial stages yields substantial gains at medium-to-high CR and higher spatial resolutions, offering a flexible approach for large-scale HSI archives. The work also points toward future integration of foundation models as backbones to further enhance robustness and generalization across sensors and conditions.

Abstract

With the rapid growth of hyperspectral data archives in remote sensing (RS), the need for efficient storage has become essential, driving significant attention toward learning-based hyperspectral image (HSI) compression. However, a comprehensive investigation of the individual and joint effects of spectral and spatial compression on learning-based HSI compression has not yet been conducted. Such an analysis is crucial for understanding how the exploitation of spectral, spatial, and joint spatio-spectral redundancies affects HSI compression. To address this issue, we propose Adjustable Spatio-Spectral Hyperspectral Image Compression Network (HyCASS), a learning-based model designed for adjustable HSI compression in both spectral and spatial dimensions. HyCASS consists of six main modules: 1) spectral encoder module; 2) spatial encoder module; 3) compression ratio (CR) adapter encoder module; 4) CR adapter decoder module; 5) spatial decoder module; and 6) spectral decoder module. The modules employ convolutional layers and transformer blocks to capture both short-range and long-range redundancies. Experimental results on three HSI benchmark datasets demonstrate the effectiveness of our proposed adjustable model compared to existing learning-based compression models, surpassing the state of the art by up to 2.36 dB in terms of PSNR. Based on our results, we establish a guideline for effectively balancing spectral and spatial compression across different CRs, taking into account the spatial resolution of the HSIs. Our code and pre-trained model weights are publicly available at https://git.tu-berlin.de/rsim/hycass.
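To put the reported PSNR gain in context, the following is a minimal sketch of the standard PSNR definition, not necessarily the exact evaluation code used by the authors. It assumes pixel values normalized so that the peak value is `max_value` (1.0 here), and the function names `psnr` and `mse_ratio_for_gain` are hypothetical helpers introduced for illustration.

```python
import math


def psnr(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flat sequences of values.

    Assumption: values are normalized so that the peak value is `max_value`.
    """
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all values (all bands and pixels, flattened).
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a PSNR gain of `gain_db` dB."""
    return 10.0 ** (gain_db / 10.0)
```

Under this definition, a PSNR gain of 2.36 dB at a fixed peak value corresponds to a mean squared error that is roughly 1.7 times lower, since `mse_ratio_for_gain(2.36)` evaluates to `10 ** 0.236`.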

Paper Structure

This paper contains 30 sections, 12 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Overview of our proposed HyCASS model. Initially, a pixelwise convolution in the spectral encoder module extracts spectral features. The spatial encoder module, composed of $S$ stacked stages, performs both long- and short-range spatial feature extraction, with each spatial stage introducing further spatial compression. Subsequently, the CR adapter encoder module adjusts the size of the latent representation to match the targeted spatio-spectral CR. The decoder mirrors the encoder structure, replacing downsampling with upsampling operations.
  • Figure 2: Architecture of the Residual Swin Transformer Block (RSTB) and the Swin Transformer layer (STL). The RSTB captures long-range spatial redundancies using FE, STL and FU subcomponents. The STL applies multi-head self-attention within and across local windows using LN, WA, SWA and MLP subcomponents. The layout is redesigned based on liu2021swin and lu2021transformer.
  • Figure 3: An example HSI from the HySpecNet-11k dataset (fuchs2023hyspecnet).
  • Figure 4: An example HSI from the Berlin-Urban-Gradient dataset (okujeni2016berlin).
  • Figure 5: An example HSI from the MLRetSet dataset (omruuzun2024novel).
  • ...and 3 more figures
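
The encoder pipeline in the Figure 1 caption can be illustrated with simple shape arithmetic. The sketch below is not the authors' implementation. It assumes that the CR is the ratio of input values to latent values at equal bit depth, and that each spatial stage divides height and width by a fixed factor (2 by default). Neither assumption is stated in the text above. The function name `latent_shape` and the 192-band, 128 × 128 input are hypothetical.

```python
def latent_shape(bands, height, width, stages, target_cr, downsample=2):
    """Return (channels, h, w) of a latent that meets an integer `target_cr`.

    Assumptions (not taken from the paper):
      * CR = (number of input values) / (number of latent values);
      * every spatial stage divides height and width by `downsample`.
    """
    factor = downsample ** stages
    if height % factor or width % factor:
        raise ValueError("height and width must be divisible by downsample**stages")
    h, w = height // factor, width // factor
    # Compression already contributed by the spatial stages.
    spatial_cr = factor ** 2
    # The channel count is chosen so that the overall CR hits the target,
    # which is the role the caption assigns to the CR adapter.
    numerator = bands * spatial_cr
    if numerator % target_cr or numerator < target_cr:
        raise ValueError("target_cr not reachable with an integer channel count")
    return numerator // target_cr, h, w


# Same target CR of 16, reached with 0, 1, or 2 spatial stages.
for s in range(3):
    print(s, latent_shape(192, 128, 128, stages=s, target_cr=16))
```

Under these assumptions, all three configurations store the same number of latent values. They differ only in how the compression is split between the spectral and spatial dimensions, which is the trade-off the paper's guideline addresses.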