DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification

Jiamu Sheng, Jingyi Zhou, Jiong Wang, Peng Ye, Jiayuan Fan

TL;DR

This work addresses the need for accurate yet efficient hyperspectral image (HSI) classification by introducing DualMamba, a lightweight parallel architecture that combines a Mamba-based global modeling stream with a lightweight CNN stream for local feature extraction. The model integrates dynamic positional embedding, unidirectional spatial scans, bidirectional spectral scans, cross-attention fusion, and an adaptive global-local fusion module to produce a global-local spectral-spatial representation. Experiments on Indian Pines, WHU-Hi-Longkou, and Houston 2018 show state-of-the-art accuracy with substantially fewer parameters and FLOPs than CNN, RNN, and transformer baselines, as well as robustness to limited training data. The approach offers a scalable, edge-friendly option for high-dimensional HSI analysis in environmental monitoring and related fields.
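To make the two scan orders concrete, the following minimal sketch shows how a small hyperspectral patch could be turned into sequences: one row-major pass over pixel positions (a unidirectional spatial scan) and a forward plus reversed pass over the bands of one pixel (a bidirectional spectral scan). The helper names, the row-major order, and the use of the center pixel are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only: function names, row-major order, and the use of the
# center pixel for the spectral sequence are assumptions, not the paper's code.

def spatial_scan(patch):
    """Flatten an H x W x C patch into one sequence of pixel spectra (row-major)."""
    return [pixel for row in patch for pixel in row]

def spectral_scans(patch):
    """Return the forward and reversed band sequences of the center pixel."""
    h, w = len(patch), len(patch[0])
    center = patch[h // 2][w // 2]
    forward = list(center)
    backward = forward[::-1]
    return forward, backward

# A toy 3 x 3 patch with 4 bands; each value encodes (row, col, band).
patch = [[[100 * r + 10 * c + b for b in range(4)] for c in range(3)] for r in range(3)]

spatial_seq = spatial_scan(patch)   # 9 tokens, each a 4-band spectrum
fwd, bwd = spectral_scans(patch)    # 4 tokens each, in opposite directions
print(len(spatial_seq), fwd, bwd)
```

A single spatial pass keeps the spatial sequence short, while the two spectral passes let every band be summarized with context from both lower and higher wavelengths.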

Abstract

The effectiveness and efficiency of modeling complex spectral-spatial relations are both crucial for hyperspectral image (HSI) classification. Most existing methods based on CNNs and transformers still suffer from heavy computational burdens and have room for improvement in capturing the global-local spectral-spatial feature representation. To this end, we propose a novel lightweight parallel design called the lightweight dual-stream Mamba-convolution network (DualMamba) for HSI classification. Specifically, parallel lightweight Mamba and CNN blocks are developed to extract global and local spectral-spatial features. First, the cross-attention spectral-spatial Mamba module is proposed to leverage the global modeling of Mamba at linear complexity. Within this module, dynamic positional embedding is designed to enhance the spatial location information of visual sequences. The lightweight spectral and spatial Mamba blocks combine an efficient scanning strategy with a lightweight Mamba design to extract global spectral-spatial features efficiently. The cross-attention spectral-spatial fusion is designed to learn cross-correlation and fuse spectral-spatial features. Second, the lightweight spectral-spatial residual convolution module is proposed with lightweight spectral and spatial branches to extract local spectral-spatial features through residual learning. Finally, the adaptive global-local fusion is proposed to dynamically combine global Mamba features and local convolution features into a global-local spectral-spatial representation. Experimental results on three public HSI datasets demonstrate that DualMamba achieves state-of-the-art classification accuracy while substantially reducing model parameters and floating point operations (FLOPs) compared with existing HSI classification methods.
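The linear-complexity claim follows from the recurrent form of a state space model: each token updates a fixed-size hidden state once, so the cost grows linearly with sequence length, whereas self-attention compares every pair of tokens. The sketch below is a scalar toy recurrence with fixed parameters. Mamba itself makes the parameters input-dependent and uses a hardware-aware scan, and the function names, parameter values, and the choice to sum the two directions are assumptions for illustration rather than the paper's design.

```python
# Toy scalar state space recurrence: h_t = a * h_{t-1} + b * x_t, y_t = c * h_t.
# The parameters a, b, c are fixed, hypothetical values chosen for illustration.

def ssm_scan(xs, a=0.5, b=1.0, c=2.0):
    """Run the recurrence once over xs; one constant-cost update per token."""
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys

def bidirectional_scan(xs):
    """Sum a forward pass and a reversed pass so each token sees both sides.

    Summation is an assumed merge rule, used here only to keep the sketch small.
    """
    fwd = ssm_scan(xs)
    bwd = ssm_scan(xs[::-1])[::-1]
    return [f + g for f, g in zip(fwd, bwd)]

print(ssm_scan([1.0, 0.0, 0.0]))            # impulse response decays by a each step
print(bidirectional_scan([1.0, 0.0, 0.0]))
```

Doubling the sequence length doubles the number of loop iterations in `ssm_scan`, which is the sense in which the global modeling is linear in the number of tokens.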

Paper Structure (36 sections, 21 equations, 10 figures, 12 tables)

Figures (10)

  • Figure 1: Performance comparisons with respect to FLOPs and parameters on the Indian Pines dataset. Our DualMamba can outperform state-of-the-art methods with the fewest parameters and FLOPs.
  • Figure 2: An overview of the proposed DualMamba is shown in (a). The core of our method leverages the cross-attention spectral-spatial Mamba module and the lightweight spectral-spatial residual convolution module to efficiently extract spectral-spatial features from global and local perspectives, respectively. The structures of the lightweight spatial Mamba and the lightweight spectral Mamba are illustrated in (b) and (c), respectively. Subsequently, an adaptive global-local fusion module is employed to obtain a comprehensive spectral-spatial representation.
  • Figure 3: Illustration of the vanilla vision Mamba block in VMamba (Liu et al., 2024) and our proposed lightweight spatial Mamba block.
  • Figure 4: Structure of the proposed adaptive global-local fusion.
  • Figure 5: Classification maps obtained by different methods on the Indian Pines dataset. (b) 2-D CNN (OA=87.77%). (c) 3-D CNN (OA=85.42%). (d) SSRN (OA=97.75%). (e) AB-LSTM (OA=87.08%). (f) MSRT (OA=97.75%). (g) SF (OA=92.31%). (h) SSFTT (OA=97.47%). (i) GAHT (OA=97.95%). (j) DualMamba (OA=99.23%).
  • ...and 5 more figures