Hyperspectral Remote Sensing Images Salient Object Detection: The First Benchmark Dataset and Baseline

Peifu Liu, Huiyan Bai, Tingfa Xu, Jihui Wang, Huan Chen, Jianan Li

TL;DR

This work addresses the lack of benchmarks for hyperspectral salient object detection by introducing HRSSD, the first dataset dedicated to HRSI-SOD, comprising 704 hyperspectral images with 32 spectral bands and 5327 pixel-level annotated salient objects. It also proposes DSSN, a baseline model featuring a Spatial-spectral Joint Feature Extractor, a Cross-level Saliency Assessment Block, and a High-resolution Fusion Module, trained with dual BCE and IoU supervision to handle large scale variation, diverse foreground-background relations, and multiple salient objects. Experiments show that DSSN achieves state-of-the-art performance on HRSSD and generalizes well to HSOD-BIT and HS-SOD, underscoring the need for hyperspectral-specific approaches. The dataset and method advance practical HRSI-SOD capabilities and pave the way for remote sensing applications in which spectral contrast provides a decisive advantage over RGB-based cues.
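The dual BCE and IoU supervision mentioned above can be illustrated with a minimal NumPy sketch. This is not the authors' training code: the function names, the soft (differentiable-style) IoU formulation, and the equal weighting of the two terms are assumptions made for illustration only.

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy between a predicted saliency map and its mask."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    per_pixel = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float(np.mean(per_pixel))

def soft_iou_loss(pred, target, eps=1e-7):
    """1 - soft IoU, computed on probabilities rather than thresholded masks."""
    inter = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target) - inter
    return float(1.0 - (inter + eps) / (union + eps))

def saliency_loss(pred, target):
    """Sum of the two terms; equal weighting is an assumption of this sketch."""
    return bce_loss(pred, target) + soft_iou_loss(pred, target)

# Toy 2x2 example: a perfect prediction and a fully inverted one.
mask = np.array([[1.0, 0.0], [0.0, 1.0]])
good = saliency_loss(mask, mask)
bad = saliency_loss(1.0 - mask, mask)
```

BCE penalizes each pixel independently, while the IoU term scores the predicted region as a whole, which is one common reason the two are paired in saliency work.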

Abstract

The objective of hyperspectral remote sensing image salient object detection (HRSI-SOD) is to identify objects or regions that exhibit distinct spectral contrast with the background. This area holds significant promise for practical applications; however, progress has been limited by a scarcity of dedicated datasets and methodologies. To bridge this gap and stimulate further research, we introduce the first HRSI-SOD dataset, termed HRSSD, which includes 704 hyperspectral images and 5327 pixel-level annotated salient objects. The HRSSD dataset poses substantial challenges for salient object detection algorithms due to large scale variation, diverse foreground-background relations, and multiple salient objects. Additionally, we propose an innovative and efficient baseline model for HRSI-SOD, termed the Deep Spectral Saliency Network (DSSN). The core of DSSN is the Cross-level Saliency Assessment Block, which performs pixel-wise attention and evaluates the contributions of multi-scale similarity maps at each spatial location, effectively reducing erroneous responses in cluttered regions and emphasizing salient regions across scales. In addition, the High-resolution Fusion Module combines a bottom-up fusion strategy with learned spatial upsampling to leverage the strengths of multi-scale saliency maps, ensuring accurate localization of small objects. Experiments on the HRSSD dataset validate the superiority of DSSN, underscoring the critical need for specialized datasets and methodologies in this domain. Further evaluations on the HSOD-BIT and HS-SOD datasets demonstrate the generalizability of the proposed method. The dataset and source code are publicly available at https://github.com/laprf/HRSSD.
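To make the idea of weighting multi-scale similarity maps at each spatial location more concrete, here is a rough sketch of one generic way to do it: a softmax over the scale axis, applied independently at every pixel. It is not the Cross-level Saliency Assessment Block itself. The function name, the use of a softmax, the externally supplied `scores`, and the assumption that all maps have already been resized to a common resolution are all hypothetical.

```python
import numpy as np

def fuse_similarity_maps(maps, scores):
    """Fuse S similarity maps with per-pixel weights.

    maps, scores: arrays of shape (S, H, W). `scores` stands in for whatever
    learned per-pixel relevance a network would produce (an assumption here).
    Returns the fused (H, W) map and the (S, H, W) weights.
    """
    # Softmax across the scale axis, stabilized by subtracting the max.
    shifted = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    fused = (weights * maps).sum(axis=0)
    return fused, weights

# Toy example: three scales on a 4x4 grid with uniform scores.
rng = np.random.default_rng(0)
maps = rng.random((3, 4, 4))
fused, weights = fuse_similarity_maps(maps, np.zeros((3, 4, 4)))
```

With uniform scores the fusion reduces to a plain average over scales; non-uniform scores let a single scale dominate at pixels where it is judged more reliable.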

Paper Structure

This paper contains 22 sections, 14 equations, 9 figures, 6 tables, 1 algorithm.

Figures (9)

  • Figure 1: Comparison with SeaNet, an RGB image-based method that struggles to detect salient objects when foreground (orange star) and background (blue star) colors are similar due to nearly uniform channel responses. In contrast, DSSN leverages rich spectral information for improved detection. The false-color image is synthesized by mapping three spectral bands (730 nm, 580 nm, and 466 nm) to the red, green, and blue channels, respectively.
  • Figure 2: Visualization of spectral statistics. To ensure significant spectral differences between the foreground and background, we excluded images with (i) similar spectral distributions and retained those with (ii) distinct spectral distributions across different land-cover categories.
  • Figure 3: Statistical Overview of the HRSSD Dataset: (a) Distribution of salient object count per image. (b) Size distribution of salient objects. (c) Area ratio between largest and smallest objects. (d) Image distribution by foreground-to-background pixel ratio. (e) Center bias in salient object placement. (f) Width and height biases in salient object sizes.
  • Figure 4: Challenges in HRSSD. (i) and (ii) illustrate (a) large scale variations present both across different images and within a single image. The complete reversal of foreground-background separation in (iii) and (iv) underscores the (b) diverse foreground-background relations. Many of the previous examples depict (c) multiple salient objects, with additional examples shown in (v).
  • Figure 5: Overview of the Deep Spectral Saliency Network. The Spatial-spectral Joint Feature Extractor employs parallel branches to extract multi-level spatial-spectral features. The Cross-level Saliency Assessment Block computes pixel-wise attention-based similarities across levels to generate multi-scale similarity maps, which are then fused in the High-resolution Fusion Module. The structure of the Cross-level Saliency Assessment Block is illustrated with all features used as input. Red arrows indicate supervision signals.
  • ...and 4 more figures
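The false-color synthesis described in the Figure 1 caption (mapping the 730 nm, 580 nm, and 466 nm bands to red, green, and blue) can be sketched as follows. The band-centre wavelengths used here are hypothetical: 32 evenly spaced values between 400 nm and 1000 nm are assumed purely for illustration, and the actual band centres should be taken from the dataset's metadata. The nearest-band lookup and the per-channel min-max scaling are likewise assumptions, not the authors' procedure.

```python
import numpy as np

def nearest_band_indices(wavelengths, targets_nm):
    """Index of the band whose centre wavelength is closest to each target."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    return [int(np.argmin(np.abs(wavelengths - t))) for t in targets_nm]

def false_color(cube, wavelengths, targets_nm=(730.0, 580.0, 466.0)):
    """Build an (H, W, 3) image in [0, 1] from an (H, W, B) hyperspectral cube."""
    idx = nearest_band_indices(wavelengths, targets_nm)
    rgb = cube[..., idx].astype(float)
    # Per-channel min-max scaling; the floor guards against constant channels.
    lo = rgb.min(axis=(0, 1), keepdims=True)
    hi = rgb.max(axis=(0, 1), keepdims=True)
    return (rgb - lo) / np.maximum(hi - lo, 1e-12)

# Hypothetical band centres and a random stand-in cube (not real HRSSD data).
wavelengths = np.linspace(400.0, 1000.0, 32)
cube = np.random.default_rng(0).random((4, 5, 32))
band_idx = nearest_band_indices(wavelengths, (730.0, 580.0, 466.0))
image = false_color(cube, wavelengths)
```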