HTD-Mamba: Efficient Hyperspectral Target Detection with Pyramid State Space Model
Dunbin Shen, Xuanbing Zhu, Jiacheng Tian, Jianjun Liu, Zhenrong Du, Hongyu Wang, Xiaorui Ma
TL;DR
This work tackles hyperspectral target detection under limited prior knowledge and spectral variation by introducing HTD-Mamba, a self-supervised framework that combines spectrally contrastive learning with a pyramid state space model (SSM) backbone. Key ideas include Spatial-Encoded Spectral Augmentation (SESA) to generate view pairs, group-wise spectral embeddings that turn each pixel into a spectral sequence, and a pyramid SSM built on Mamba that captures multiresolution, long-range spectral dependencies with linear complexity, all optimized via a spectral contrastive head. Detection is performed by matching pixel features to a prior target spectrum through cosine similarity, followed by a nonlinear background suppression step. Experiments on four public datasets show that HTD-Mamba achieves state-of-the-art target detection performance and robust background suppression with competitive runtime; code is publicly available at https://github.com/shendb2022/HTD-Mamba.
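The detection step described above can be illustrated with a minimal NumPy sketch. It assumes that per-pixel features and a target feature vector have already been produced by some encoder; the function names, the exponential form of the suppression, and the parameter `lam` are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np

def cosine_detection_map(features, target_feature, eps=1e-12):
    """Cosine similarity between every pixel feature and the target feature.

    features: array of shape (H, W, D); target_feature: array of shape (D,).
    Returns an (H, W) map with values in [-1, 1].
    """
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + eps)
    t = target_feature / (np.linalg.norm(target_feature) + eps)
    return f @ t

def suppress_background(scores, lam=0.1):
    """Assumed nonlinear suppression: exp(-(1 - s)^2 / lam).

    Scores near 1 stay near 1; lower scores are pushed toward 0.
    This functional form is an illustrative assumption.
    """
    s = np.clip(scores, 0.0, 1.0)
    return np.exp(-((1.0 - s) ** 2) / lam)
```

A smaller `lam` suppresses moderately similar background pixels more aggressively, at the risk of also attenuating targets whose spectra deviate from the prior.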
Abstract
Hyperspectral target detection (HTD) identifies objects of interest from complex backgrounds at the pixel level, playing a vital role in Earth observation. However, HTD faces challenges due to limited prior knowledge and spectral variation, leading to underfitting models and unreliable performance. To address these challenges, this paper proposes an efficient self-supervised HTD method with a pyramid state space model (SSM), named HTD-Mamba, which employs spectrally contrastive learning to distinguish between target and background based on the similarity measurement of intrinsic features. Specifically, to obtain sufficient training samples and leverage spatial contextual information, we propose a spatial-encoded spectral augmentation technique that encodes all surrounding pixels within a patch into a transformed view of the center pixel. Additionally, to explore global band correlations, we divide pixels into continuous group-wise spectral embeddings and introduce Mamba to HTD for the first time to model long-range dependencies of the spectral sequence with linear complexity. Furthermore, to alleviate spectral variation and enhance robust representation, we propose a pyramid SSM as a backbone to capture and fuse multiresolution spectral-wise intrinsic features. Extensive experiments conducted on four public datasets demonstrate that the proposed method outperforms state-of-the-art methods in both quantitative and qualitative evaluations. Code is available at https://github.com/shendb2022/HTD-Mamba.
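As a rough illustration of the group-wise spectral sequencing mentioned above, the following sketch splits one pixel's spectrum into contiguous band groups so that it can be treated as a token sequence. The function name `group_spectrum`, the `stride` parameter, and the edge padding are assumptions introduced here; the paper's actual grouping and the learned projection that would turn each group into an embedding are not reproduced.

```python
import numpy as np

def group_spectrum(spectrum, group_size, stride=None):
    """Split a 1-D spectrum into contiguous groups of bands.

    spectrum: array of shape (B,). Returns an array of shape
    (n_groups, group_size). With stride < group_size the groups overlap.
    The tail is padded by repeating the last band (an assumed choice).
    """
    spectrum = np.asarray(spectrum, dtype=float)
    stride = group_size if stride is None else stride
    n_bands = spectrum.shape[0]
    # Number of groups needed so that every band is covered at least once.
    n_groups = max(1, int(np.ceil((n_bands - group_size) / stride)) + 1)
    padded_len = (n_groups - 1) * stride + group_size
    padded = np.pad(spectrum, (0, padded_len - n_bands), mode="edge")
    # Row i holds bands [i * stride, i * stride + group_size).
    idx = np.arange(n_groups)[:, None] * stride + np.arange(group_size)[None, :]
    return padded[idx]
```

Each row of the result would then typically be mapped to an embedding by a learned linear layer before being passed to a sequence model; that step is omitted here.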
