Hyperspectral Remote Sensing Images Salient Object Detection: The First Benchmark Dataset and Baseline
Peifu Liu, Huiyan Bai, Tingfa Xu, Jihui Wang, Huan Chen, Jianan Li
TL;DR
This work addresses the lack of benchmarks for hyperspectral remote sensing image salient object detection (HRSI-SOD) by introducing HRSSD, the first dataset dedicated to the task, comprising 704 hyperspectral images with 32 spectral bands and 5327 pixel-level annotated salient objects. It also proposes DSSN, a baseline model featuring a Spatial-spectral Joint Feature Extractor, a Cross-level Saliency Assessment Block, and a High-resolution Fusion Module, trained with dual BCE and IoU supervision to handle large scale variation, diverse foreground-background relations, and multi-salient objects. Experiments show that DSSN achieves state-of-the-art performance on HRSSD and generalizes well to HSOD-BIT and HS-SOD, underscoring the need for hyperspectral-specific approaches. The dataset and method advance practical HRSI-SOD capabilities and pave the way for remote sensing applications in which spectral contrast provides a decisive advantage over RGB-based cues.
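To make the BCE and IoU supervision mentioned above concrete, the following is a minimal NumPy sketch of a combined loss for a single saliency map. The function name `bce_iou_loss`, the equal weighting of the two terms, and the soft (differentiable) form of the IoU term are assumptions made for illustration; how the "dual" supervision is applied across the network's outputs is not specified here, and this is not the authors' implementation.

```python
import numpy as np

def bce_iou_loss(pred, target, eps=1e-7):
    """Binary cross-entropy plus soft IoU loss for one saliency map.

    pred:   predicted saliency probabilities in [0, 1], shape (H, W)
    target: binary ground-truth mask, shape (H, W)
    Equal weighting of the two terms is an assumption of this sketch.
    """
    # Clip probabilities so the logarithms below stay finite.
    pred = np.clip(np.asarray(pred, dtype=np.float64), eps, 1.0 - eps)
    target = np.asarray(target, dtype=np.float64)

    # Pixel-wise binary cross-entropy, averaged over the map.
    bce = -np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))

    # Soft IoU: intersection and union computed on probabilities.
    inter = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target) - inter
    iou = 1.0 - (inter + eps) / (union + eps)

    return bce + iou
```

The BCE term penalizes each pixel independently, while the IoU term scores the predicted region as a whole, which is one common reason the two are paired when salient objects vary widely in size.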
Abstract
The objective of hyperspectral remote sensing image salient object detection (HRSI-SOD) is to identify objects or regions that exhibit distinct spectral contrast with the background. This area holds significant promise for practical applications; however, progress has been limited by a scarcity of dedicated datasets and methodologies. To bridge this gap and stimulate further research, we introduce the first HRSI-SOD dataset, termed HRSSD, which includes 704 hyperspectral images and 5327 pixel-level annotated salient objects. The HRSSD dataset poses substantial challenges for salient object detection algorithms due to large scale variation, diverse foreground-background relations, and multi-salient objects. We also propose an innovative and efficient baseline model for HRSI-SOD, termed the Deep Spectral Saliency Network (DSSN). The core of DSSN is the Cross-level Saliency Assessment Block, which performs pixel-wise attention and evaluates the contributions of multi-scale similarity maps at each spatial location, reducing erroneous responses in cluttered regions and emphasizing salient regions across scales. In addition, the High-resolution Fusion Module combines a bottom-up fusion strategy with learned spatial upsampling to leverage the strengths of multi-scale saliency maps, ensuring accurate localization of small objects. Experiments on the HRSSD dataset validate the superiority of DSSN, underscoring the need for specialized datasets and methodologies in this domain. Further evaluations on the HSOD-BIT and HS-SOD datasets demonstrate the generalizability of the proposed method. The dataset and source code are publicly available at https://github.com/laprf/HRSSD.
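The idea of weighting multi-scale similarity maps separately at each spatial location can be illustrated with a small NumPy sketch. It assumes the maps have already been resampled to a common resolution and that a per-pixel score is available for each scale; in a trained network those scores would be learned, whereas here they are plain inputs. The function name `pixelwise_fuse` and the use of a softmax over scales are hypothetical choices for illustration and are not taken from the DSSN code.

```python
import numpy as np

def pixelwise_fuse(maps, scores):
    """Fuse multi-scale maps with per-pixel softmax weights.

    maps:   array of shape (S, H, W), one similarity map per scale,
            already resampled to a common H x W grid (assumption)
    scores: array of shape (S, H, W), unnormalized per-pixel scores
            for each scale (learned in a real model; given here)
    Returns the fused (H, W) map and the (S, H, W) weights.
    """
    maps = np.asarray(maps, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)

    # Softmax across the scale axis, shifted for numerical stability.
    shifted = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)

    # Each pixel takes its own convex combination of the S scales.
    fused = (weights * maps).sum(axis=0)
    return fused, weights

# Usage: with all-zero scores every scale contributes equally,
# so the fused map reduces to the plain average over scales.
rng = np.random.default_rng(0)
maps = rng.random((3, 8, 8))
fused, weights = pixelwise_fuse(maps, np.zeros((3, 8, 8)))
```

Because the weights are computed per pixel, a scale that responds spuriously in a cluttered area can be down-weighted there while still contributing elsewhere in the image.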
