Multi-head Spatial-Spectral Mamba for Hyperspectral Image Classification

Muhammad Ahmad, Muhammad Hassaan Farooq Butt, Muhammad Usama, Hamad Ahmed Altuwaijri, Manuel Mazzara, Salvatore Distefano

TL;DR

This work tackles hyperspectral image classification (HSIC), where a model must cope with high dimensionality while capturing long-range spatial-spectral dependencies and the sequential structure of the spectral bands. It introduces MHSSMamba, a Spatial-Spectral Mamba variant that combines spectral-spatial token generation and center-context token enhancement with multi-head self-attention and a state-space model that captures sequential spectral dynamics. Key contributions include: (i) separate spectral and spatial token extraction with enhancement, (ii) a multi-head attention scheme over the spectral and spatial tokens, (iii) a gating-based token enhancement module, and (iv) integration of a state-space model for sequential context, all in an end-to-end HSIC pipeline. On four public HSI datasets, MHSSMamba achieves state-of-the-art performance, with classification accuracies ranging from 96.85% (Salinas) to 99.49% (WHU-Hi-LongKou).
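
Contributions (i) and (iii) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the projection matrices `w_spec`, `w_spat`, and `w_gate`, the 5x5x30 patch, the token width `D`, and the exact form of the gate are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def make_tokens(patch, w_spec, w_spat):
    """Turn an (H, W, B) HSI patch into spectral and spatial token sequences."""
    h, w, b = patch.shape
    pixels = patch.reshape(h * w, b)       # one spectrum per pixel: (H*W, B)
    spatial_tokens = pixels @ w_spat        # one token per pixel location: (H*W, D)
    spectral_tokens = pixels.T @ w_spec     # one token per spectral band: (B, D)
    return spectral_tokens, spatial_tokens

def enhance_with_center(tokens, center, w_gate):
    """Hypothetical gate: blend each token with the center-pixel token."""
    ctx = np.broadcast_to(center, tokens.shape)                      # (N, D)
    gate = sigmoid(np.concatenate([tokens, ctx], axis=1) @ w_gate)   # values in (0, 1)
    return gate * tokens + (1.0 - gate) * ctx

rng = np.random.default_rng(0)
H, W, B, D = 5, 5, 30, 16                  # assumed sizes, not taken from the paper
patch = rng.normal(size=(H, W, B))
w_spec = rng.normal(size=(H * W, D)) * 0.1
w_spat = rng.normal(size=(B, D)) * 0.1
w_gate = rng.normal(size=(2 * D, D)) * 0.1

spec, spat = make_tokens(patch, w_spec, w_spat)
center = spat[(H // 2) * W + W // 2]       # token of the patch's center pixel
spat_enh = enhance_with_center(spat, center, w_gate)
print(spec.shape, spat_enh.shape)
```

In this sketch the gate decides, per feature, how much of each token is kept and how much is replaced by the center-pixel context, which is one plausible reading of "center-context token enhancement".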

Abstract

Spatial-Spectral Mamba (SSM) improves computational efficiency and captures long-range dependencies, addressing Transformer limitations. However, traditional Mamba models overlook rich spectral information in HSIs and struggle with high dimensionality and sequential data. To address these issues, we propose an SSM with multi-head self-attention and token enhancement (MHSSMamba). This model integrates spectral and spatial information by enhancing spectral tokens and using multi-head attention to capture complex relationships between spectral bands and spatial locations. It also manages long-range dependencies and the sequential nature of HSI data, preserving contextual information across spectral bands. MHSSMamba achieved remarkable classification accuracies of 97.62% on Pavia University, 96.92% on the University of Houston, 96.85% on Salinas, and 99.49% on the Wuhan LongKou (WHU-Hi-LongKou) dataset. The source code is available at https://github.com/MHassaanButt/MHA_SS_Mamba (GitHub).
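
To make the role of multi-head attention concrete, the following is a minimal NumPy sketch of standard scaled dot-product multi-head self-attention applied to a concatenated sequence of spectral and spatial tokens. The token counts, width, head count, and random weights are assumptions; the paper's attention scheme may differ in its details.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)    # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """tokens: (N, D). Returns attended tokens (N, D) and weights (heads, N, N)."""
    n, d = tokens.shape
    assert d % num_heads == 0, "token width must be divisible by num_heads"
    d_head = d // num_heads

    def split(x):                               # (N, D) -> (heads, N, d_head)
        return x.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(tokens @ w_q), split(tokens @ w_k), split(tokens @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, N, N)
    weights = softmax(scores, axis=-1)                    # each row sums to 1
    heads = weights @ v                                   # (heads, N, d_head)
    merged = heads.transpose(1, 0, 2).reshape(n, d)       # concatenate the heads
    return merged @ w_o, weights

rng = np.random.default_rng(0)
D, NUM_HEADS = 16, 4                           # assumed sizes
spectral_tokens = rng.normal(size=(30, D))     # e.g. one token per band
spatial_tokens = rng.normal(size=(25, D))      # e.g. one token per pixel
tokens = np.concatenate([spectral_tokens, spatial_tokens], axis=0)

w_q, w_k, w_v, w_o = (rng.normal(size=(D, D)) * 0.1 for _ in range(4))
out, attn = multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, NUM_HEADS)
print(out.shape, attn.shape)
```

Because both token types share one sequence here, every band token can attend to every location token and vice versa, which is one simple way to model relationships between spectral bands and spatial locations.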

Paper Structure

This paper contains 6 sections, 5 equations, 5 figures, and 3 tables.

Figures (5)

  • Figure 1: Joint spatial-spectral feature tokens are first computed from the HSI. These tokens are then encoded by the MHSSMamba model, whose token enhancement and multi-head attention modules allow a more selective and effective representation of information than standard fixed-dimension encodings. The output is subsequently processed by a state-space model, followed by normalization and a linear layer, before being passed to the classification head, which produces the predicted class map.
  • Figure 2: Qualitative results of the University of Houston Dataset.
  • Figure 3: Qualitative results of the Pavia University Dataset.
  • Figure 4: Qualitative results of the Salinas Dataset.
  • Figure 5: Qualitative results of the WHU-Hi-LongKou Dataset.
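
The back half of the pipeline in Figure 1 (state-space model, normalization, linear layer, classification head) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's model: it uses a fixed, time-invariant diagonal state-space recurrence, whereas Mamba-style models make these parameters input-dependent, and all sizes, weights, and the mean-pooling step are hypothetical.

```python
import numpy as np

def ssm_scan(x, a, b, c):
    """Discrete linear SSM with diagonal transition.

    h_t = a * h_{t-1} + b @ x_t,   y_t = c @ h_t
    x: (N, D), a: (S,), b: (S, D), c: (D_out, S)
    """
    n = x.shape[0]
    state = np.zeros(a.shape[0])
    ys = np.empty((n, c.shape[0]))
    for t in range(n):
        state = a * state + b @ x[t]    # carry context along the token sequence
        ys[t] = c @ state
    return ys

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def classify(tokens, a, b, c, w_out):
    """SSM -> normalization -> mean pooling (assumed) -> linear head."""
    y = layer_norm(ssm_scan(tokens, a, b, c))
    pooled = y.mean(axis=0)             # collapse the sequence to one feature vector
    logits = pooled @ w_out
    return int(np.argmax(logits)), logits

rng = np.random.default_rng(0)
N, D, S, NUM_CLASSES = 55, 16, 8, 9     # assumed sizes
tokens = rng.normal(size=(N, D))        # stand-in for attention-encoded tokens
a = rng.uniform(0.5, 0.95, size=S)      # |a| < 1 keeps the recurrence stable
b = rng.normal(size=(S, D)) * 0.1
c = rng.normal(size=(D, S)) * 0.1
w_out = rng.normal(size=(D, NUM_CLASSES)) * 0.1

label, logits = classify(tokens, a, b, c, w_out)
print(label, logits.shape)
```

The recurrence costs time linear in the sequence length, which is the efficiency argument usually made for state-space models over quadratic self-attention.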