Spatial and Spatial-Spectral Morphological Mamba for Hyperspectral Image Classification

Muhammad Ahmad, Muhammad Hassaan Farooq Butt, Adil Mehmood Khan, Manuel Mazzara, Salvatore Distefano, Muhammad Usama, Swalpa Kumar Roy, Jocelyn Chanussot, Danfeng Hong

TL;DR

MorpMamba tackles the efficiency bottlenecks of CNNs and Transformers in hyperspectral image (HSI) classification by integrating morphological operations with the Mamba/SSM sequence-modeling paradigm. It generates spatial-spectral tokens via erosion and dilation, refines them with center-region gated token enhancement and multi-head self-attention, and then processes the enhanced features with a state-space model. The approach yields competitive classification accuracy with far fewer parameters and linear computational complexity, outperforming several state-of-the-art models on multiple HSI datasets while remaining robust to noise and structural variation. The work positions MorpMamba as a practical, scalable alternative for resource-constrained deployment and as a foundation for future multi-modal and multi-temporal hyperspectral analysis.
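The linear-complexity claim can be illustrated with a minimal sketch of a discrete linear state-space recurrence, $h_t = A h_{t-1} + B x_t$, $y_t = C h_t$. This is not the MorpMamba or Mamba implementation: the function name `ssm_scan`, the fixed (non-selective) matrices `A`, `B`, `C`, and all sizes are assumptions chosen for illustration. The point is only that each token is visited once, so the cost grows linearly with the sequence length, whereas self-attention compares every pair of tokens.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Run a discrete linear state-space recurrence over a token sequence.

    x: (L, d_in) tokens, A: (n, n), B: (n, d_in), C: (d_out, n).
    Returns y of shape (L, d_out). One state update per token, so the
    cost is linear in the sequence length L.
    """
    L = x.shape[0]
    h = np.zeros(A.shape[0])          # hidden state, starts at zero
    y = np.empty((L, C.shape[0]))
    for t in range(L):
        h = A @ h + B @ x[t]          # state update
        y[t] = C @ h                  # readout
    return y

# Illustrative sizes only (hypothetical, not taken from the paper).
rng = np.random.default_rng(0)
L, d, n = 16, 4, 8
x = rng.standard_normal((L, d))
A = 0.9 * np.eye(n)                   # stable diagonal transition
B = rng.standard_normal((n, d))
C = rng.standard_normal((d, n))
y = ssm_scan(x, A, B, C)
```

Mamba-style models additionally make the SSM parameters depend on the input and replace the Python loop with a hardware-efficient scan; the sketch omits both.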

Abstract

Recent advancements in transformers, specifically self-attention mechanisms, have significantly improved hyperspectral image (HSI) classification. However, these models often suffer from inefficiencies, as their computational complexity scales quadratically with sequence length. To address these challenges, we propose the morphological spatial Mamba (SMM) and morphological spatial-spectral Mamba (SSMM) models (MorpMamba), which combine the strengths of morphological operations and the state space model framework, offering a more computationally efficient alternative to transformers. In MorpMamba, a novel token generation module first converts HSI patches into spatial-spectral tokens. These tokens are then processed through morphological operations such as erosion and dilation, utilizing depthwise separable convolutions to capture structural and shape information. A token enhancement module refines these features by dynamically adjusting the spatial and spectral tokens based on central HSI regions, ensuring effective feature fusion within each block. Subsequently, multi-head self-attention is applied to further enrich the feature representations, allowing the model to capture complex relationships and dependencies within the data. Finally, the enhanced tokens are fed into a state space module, which efficiently models the temporal evolution of the features for classification. Experimental results on widely used HSI datasets demonstrate that MorpMamba achieves superior parametric efficiency compared to traditional CNN and transformer models while maintaining high accuracy. The code will be made publicly available at https://github.com/mahmad000/MorpMamba.
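To make the erosion and dilation step concrete, the following is a minimal sketch of classical grayscale morphology applied band by band to a small HSI patch. It is an assumption-laden illustration, not the paper's module: the abstract states that MorpMamba realizes these operations with depthwise separable convolutions, whereas this sketch uses a flat square structuring element (a sliding-window minimum and maximum). The helper names `_windows`, `erode`, and `dilate`, the odd window size `k`, and the toy patch are all hypothetical.

```python
import numpy as np

def _windows(img, k):
    """Return all k x k sliding windows of a 2-D array (k odd).

    The array is edge-padded first so the result has shape (H, W, k, k).
    """
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    return np.lib.stride_tricks.sliding_window_view(padded, (k, k))

def erode(img, k=3):
    """Grayscale erosion: minimum over each k x k neighbourhood."""
    return _windows(img, k).min(axis=(-2, -1))

def dilate(img, k=3):
    """Grayscale dilation: maximum over each k x k neighbourhood."""
    return _windows(img, k).max(axis=(-2, -1))

# Toy 5 x 5 patch with 3 spectral bands and one bright centre pixel.
patch = np.zeros((5, 5, 3))
patch[2, 2, :] = 1.0

# Apply the operators independently to every band.
bands = range(patch.shape[-1])
eroded = np.stack([erode(patch[..., b]) for b in bands], axis=-1)
dilated = np.stack([dilate(patch[..., b]) for b in bands], axis=-1)
```

In this toy case erosion removes the isolated bright pixel, while dilation grows it into a 3 x 3 block, which is the sense in which the two operations expose structural and shape information.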

Paper Structure (11 sections, 12 equations, 12 figures, 3 tables)

Figures (12)

  • Figure 1: Joint spatial-spectral feature tokens are first computed from the HSI using morphological operations. These tokens are then integrated into the MorpMamba model, which includes erosion and dilation operations, token enhancement, and a multi-head attention module. This method allows a more selective and effective representation of information than standard fixed-dimension encodings. The output is then processed through an SSM, followed by feature normalization and a linear layer with $l_2^2$ regularization. Finally, this output is passed to the classification head to generate the class predictions.
  • Figure 2: Spatial-spectral token enhancement module used to refine and enhance the extracted spatial and spectral features.
  • Figure 3: Multi-head self-attention module adapted to interact with the enhanced spatial and spectral features.
  • Figure 4: OA of MorpMamba across different training data ratios (1%, 2%, 5%, 10%, 15%, 20%, and 25%), patch sizes ($2 \times 2$, $4 \times 4$, $6 \times 6$, $8 \times 8$, and $10 \times 10$), number of heads (2, 4, 6, and 8), and kernel sizes ($3 \times 3$, $5 \times 5$, $7 \times 7$, $9 \times 9$, and $11 \times 11$) over 50 epochs on WHU-Hi-LongKou, Pavia Centre, Pavia University, Salinas, and University of Houston datasets.
  • Figure 5: Training time of MorpMamba across different training data ratios, patch sizes, number of heads, and kernel sizes over 50 epochs on WHU-Hi-LongKou, Pavia Centre, Pavia University, Salinas, and University of Houston datasets. The training ratio and patch size have a strong influence on computational time, whereas the number of heads in multi-head self-attention and the kernel size in the morphological operations do not significantly affect the computational load. This demonstrates that the Mamba model maintains a linear computational load even after incorporating multi-head self-attention and morphological operations. These additions nevertheless significantly improve performance, as shown in the subsequent sections.
  • ...and 7 more figures