Hybrid State-Space and GRU-based Graph Tokenization Mamba for Hyperspectral Image Classification
Muhammad Ahmad, Muhammad Hassaan Farooq Butt, Muhammad Usama, Manuel Mazzara, Salvatore Distefano, Adil Mehmood Khan, Danfeng Hong
TL;DR
GraphMamba is a hybrid framework for hyperspectral image classification that fuses spectral-spatial tokenization, graph-based token prioritization, cross-attention, and a GRU-based state-space model to capture complex spectral-spatial dynamics while remaining scalable. The dual tokenization stage uses $1\times 1$ spectral convolutions and $3\times 3$ spatial convolutions; the resulting tokens then pass through graph-based prioritization, a cross-attention module, and a GRU-driven sequence model to produce the classification. Ablation studies show that combining graph tokenization with attention yields the strongest performance across diverse datasets, achieving state-of-the-art accuracy with significantly fewer parameters than CNN/Transformer baselines and competitive runtime and memory profiles. The approach generalizes well across datasets with varying spectral bands and resolutions, which makes it practical for real-world HSI tasks and resource-constrained deployments.
Abstract
Hyperspectral image (HSI) classification plays a pivotal role in domains such as environmental monitoring, agriculture, and urban planning. However, it faces significant challenges due to the high dimensionality of the data and the complex spectral-spatial relationships inherent in HSI. Traditional methods, including conventional machine learning and convolutional neural networks (CNNs), often struggle to capture these intricate spectral-spatial features together with global contextual information. Transformer-based models capture long-range dependencies well but often demand substantial computational resources, which is a problem when labeled data are limited, as is common in HSI applications. To overcome these challenges, this work proposes GraphMamba, a hybrid model that combines spectral-spatial token generation, graph-based token prioritization, and cross-attention mechanisms. The model introduces a novel hybridization of state-space modeling and Gated Recurrent Units (GRU), capturing both linear and nonlinear spatial-spectral dynamics. GraphMamba improves the modeling of complex spatial-spectral relationships while maintaining scalability and computational efficiency across diverse HSI datasets. Through comprehensive experiments, we demonstrate that GraphMamba outperforms existing state-of-the-art models, offering a scalable and robust solution for complex HSI classification tasks.
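The dual tokenization summarized in the TL;DR ($1\times 1$ spectral and $3\times 3$ spatial convolutions) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the patch size, band count, token dimension, zero padding, random weights, and the function names `spectral_tokens` and `spatial_tokens` are all assumptions chosen for illustration. The later stages (graph prioritization, cross-attention, the GRU-based state-space model) are omitted because their details are not given here.

```python
import numpy as np

def spectral_tokens(patch, w_spec):
    """1x1 convolution: a per-pixel linear map over the band axis."""
    h, w, bands = patch.shape
    return patch.reshape(h * w, bands) @ w_spec  # shape (H*W, D)

def spatial_tokens(patch, w_spat):
    """3x3 convolution with zero padding, so the output keeps the H x W grid."""
    h, w, _ = patch.shape
    dim = w_spat.shape[-1]
    padded = np.pad(patch, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, w, dim))
    for di in range(3):
        for dj in range(3):
            # Shifted view of the patch times the kernel slice at (di, dj).
            out += padded[di:di + h, dj:dj + w, :] @ w_spat[di, dj]
    return out.reshape(h * w, dim)  # shape (H*W, D)

# Hypothetical sizes: a 5x5 patch with 30 bands, token dimension 16.
rng = np.random.default_rng(0)
patch = rng.standard_normal((5, 5, 30))
w_spec = rng.standard_normal((30, 16)) * 0.1
w_spat = rng.standard_normal((3, 3, 30, 16)) * 0.1

spec = spectral_tokens(patch, w_spec)
spat = spatial_tokens(patch, w_spat)
print(spec.shape, spat.shape)  # (25, 16) (25, 16)
```

Each pixel yields one spectral token, which mixes bands only, and one spatial token, which also mixes its 3x3 neighborhood. A downstream module could then rank or fuse the two token sets.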
