HyCoT: A Transformer-Based Autoencoder for Hyperspectral Image Compression

Martin Hermann Paul Fuchs, Behnood Rasti, Begüm Demir

TL;DR

HyCoT introduces a transformer-based autoencoder for hyperspectral image compression that leverages long-range spectral dependencies through a SpectralFormer-backed encoder and a lightweight MLP decoder. To accelerate training, it employs a random training-set reduction strategy, enabling efficient hyperparameter tuning on a smaller HySpecNet-11k mini split. Experiments show HyCoT surpasses state-of-the-art CAEs across compression ratios by over 1 dB PSNR while reducing training cost and computational complexity. This approach enables high-quality HSI compression with scalable performance and practical applicability in real-time or storage-constrained scenarios.

Abstract

The development of learning-based hyperspectral image (HSI) compression models has recently attracted significant interest. Existing models predominantly utilize convolutional filters, which capture only local dependencies. Furthermore, they often incur high training costs and exhibit substantial computational complexity. To address these limitations, in this paper we propose the Hyperspectral Compression Transformer (HyCoT), a transformer-based autoencoder for pixelwise HSI compression. Additionally, we apply a simple yet effective training set reduction approach to accelerate the training process. Experimental results on the HySpecNet-11k dataset demonstrate that HyCoT surpasses the state of the art across various compression ratios by over 1 dB of PSNR with significantly reduced computational requirements. Our code and pre-trained weights are publicly available at https://git.tu-berlin.de/rsim/hycot.
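
The training set reduction mentioned above can be illustrated with a minimal sketch. It assumes the reduction is a uniform random draw without replacement. The function name `reduce_training_set`, the `fraction` and `seed` parameters, and the example sizes are hypothetical and not taken from the paper or its repository. Whether the draw is made over patches or over pixels is not stated here, so samples are treated as opaque identifiers.

```python
import random


def reduce_training_set(sample_ids, fraction, seed=0):
    """Return a random subset of the training samples, drawn without replacement.

    Assumption: a uniform random draw; the paper's exact procedure may differ.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    keep = max(1, round(len(sample_ids) * fraction))
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return sorted(rng.sample(list(sample_ids), keep))


# Hypothetical example: 1000 sample identifiers reduced to 10 %.
sample_ids = list(range(1000))
subset = reduce_training_set(sample_ids, fraction=0.10, seed=42)
```

Fixing the seed keeps the reduced split identical across runs, so hyperparameter settings are compared on the same data.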

Paper Structure (9 sections, 4 equations, 2 figures, 2 tables)

This paper contains 9 sections, 4 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Overview of our proposed HyCoT model. For each pixel of a given HSI, the spectral signature is padded. Neighbouring spectral bands are grouped and embedded by a linear projection (Hong et al., 2021, SpectralFormer) to form the transformer tokens. A class token (CT) is prepended and a learned position embedding is added. The sequence of tokens is fed into the transformer encoder (Vaswani et al., 2017). Then, the CT is extracted and projected in the MLP encoder to fit the target compression ratio (CR). An MLP decoder is applied to reconstruct the full spectral signature. Finally, the decoded pixels are reassembled to form the reconstructed HSI.
  • Figure 2: Rate-distortion performance on the test set of HySpecNet-11k (easy split).
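
The first steps in the Figure 1 caption (padding the spectral signature, grouping neighbouring bands, and sizing the latent code for a target compression ratio) can be sketched in plain Python. This is only an illustration under stated assumptions: zero-padding at the end of the spectrum, non-overlapping groups, and latent values stored at the same bit depth as the input bands are all assumptions, and the names `group_bands` and `latent_size` are hypothetical. The linear projection, transformer encoder, and MLPs are omitted.

```python
import math


def group_bands(spectrum, group_size, pad_value=0.0):
    """Pad a spectral signature and split it into groups of neighbouring bands.

    Assumption: padding is appended at the end with a constant value.
    """
    n_groups = math.ceil(len(spectrum) / group_size)
    padding = [pad_value] * (n_groups * group_size - len(spectrum))
    padded = list(spectrum) + padding
    return [padded[i * group_size:(i + 1) * group_size] for i in range(n_groups)]


def latent_size(n_bands, compression_ratio):
    """Latent values per pixel so that n_bands / latent_size >= compression_ratio.

    Assumption: latent values use the same bit depth as the input bands.
    """
    return max(1, n_bands // compression_ratio)


# Hypothetical pixel with 10 bands, grouped 4 bands at a time.
pixel = [float(b) for b in range(10)]
groups = group_bands(pixel, group_size=4)

# Each group would become one token; the class token adds one more.
sequence_length = len(groups) + 1
```

In this sketch each group corresponds to one transformer token after the linear projection, so the sequence length seen by the encoder is the number of groups plus one for the class token.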