United Domain Cognition Network for Salient Object Detection in Optical Remote Sensing Images

Yanguang Sun, Jian Yang, Lei Luo

TL;DR

This paper proposes the United Domain Cognition Network (UDCNet), which jointly explores global-local information in the frequency and spatial domains, and demonstrates its superiority over 24 state-of-the-art models.

Abstract

Recently, deep learning-based salient object detection (SOD) in optical remote sensing images (ORSIs) has achieved significant breakthroughs. We observe that existing ORSIs-SOD methods consistently center on optimizing pixel features in the spatial domain, progressively distinguishing between backgrounds and objects. However, pixel information represents local attributes, which are often correlated with their surrounding context. Even with strategies that expand the local region, spatial features remain biased towards local characteristics and lack global perception. To address this problem, we introduce the Fourier transform, which generates global frequency features and achieves an image-size receptive field. Specifically, we propose a novel United Domain Cognition Network (UDCNet) to jointly explore the global-local information in the frequency and spatial domains. Technically, we first design a frequency-spatial domain transformer block that mutually amalgamates the complementary local spatial and global frequency features to strengthen the capability of the initial input features. Furthermore, a dense semantic excavation module is constructed to capture higher-level semantics for guiding the positioning of remote sensing objects. Finally, we devise a dual-branch joint optimization decoder that applies the saliency and edge branches to generate high-quality representations for predicting salient objects. Experimental results demonstrate the superiority of the proposed UDCNet method over 24 state-of-the-art models, through extensive quantitative and qualitative comparisons on three widely used ORSIs-SOD datasets. The source code is available at https://github.com/CSYSI/UDCNet.
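
The claim that frequency features have an image-size receptive field can be illustrated with a small sketch. It is not the UDCNet implementation: it uses a naive DFT in plain Python on a 4x4 image, and the function names, the 3x3 box filter, and the fixed frequency weights (standing in for learned parameters) are all assumptions made for illustration. One input pixel is perturbed and the number of output pixels that change is counted for a local spatial filter and for a frequency-domain filter.

```python
import cmath


def dft2(img, inverse=False):
    """Naive 2D discrete Fourier transform of a list-of-lists image."""
    h, w = len(img), len(img[0])
    sign = 1 if inverse else -1
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            total = 0j
            for y in range(h):
                for x in range(w):
                    angle = sign * 2 * cmath.pi * (u * y / h + v * x / w)
                    total += img[y][x] * cmath.exp(1j * angle)
            out[u][v] = total / (h * w) if inverse else total
    return out


def local_filter(img):
    """3x3 box filter with zero padding: each output sees only its neighbours."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        out[y][x] += img[yy][xx] / 9.0
    return out


def frequency_filter(img):
    """Scale every frequency coefficient, then return to the spatial domain."""
    h, w = len(img), len(img[0])
    # Hypothetical fixed weights; a network would learn these instead.
    weights = [[1.0 / (1 + u + v) for v in range(w)] for u in range(h)]
    spec = dft2(img)
    scaled = [[spec[u][v] * weights[u][v] for v in range(w)] for u in range(h)]
    # Keep only the real part so the result is an ordinary real-valued map.
    return [[z.real for z in row] for row in dft2(scaled, inverse=True)]


def changed_pixels(filter_fn, size=4):
    """Perturb one input pixel and count how many output pixels change."""
    base = [[0.0] * size for _ in range(size)]
    bumped = [row[:] for row in base]
    bumped[1][1] = 1.0
    a, b = filter_fn(base), filter_fn(bumped)
    return sum(abs(b[y][x] - a[y][x]) > 1e-9
               for y in range(size) for x in range(size))


if __name__ == "__main__":
    print("local 3x3 filter:", changed_pixels(local_filter))       # 9
    print("frequency filter:", changed_pixels(frequency_filter))   # 16
```

With these assumed weights, the local filter changes only the 3x3 neighbourhood of the perturbed pixel, while the frequency filter changes all 16 pixels, because each frequency coefficient is a sum over the whole image. In practice one would use an FFT routine from a numerical library rather than this naive transform.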

Paper Structure

This paper contains 35 sections, 14 equations, 14 figures, 10 tables.

Figures (14)

  • Figure 1: Visual results between the proposed UDCNet model and existing spatial domain-based ORSIs-SOD methods (i.e., MCCNet, MJRBM, HFANet, and UG2L).
  • Figure 2: Overall framework of our UDCNet method. We use ResNet50 or PVTv2 as the backbone, and design the frequency-spatial domain transformer (FSDT) block that contains a spatial perception self-attention (SPSA), a frequency perception self-attention (FPSA), an adaptive fusion strategy (AFS) and a cross-domain feed-forward network (CDFFN) to simultaneously model global relationships and local details. Furthermore, we propose the dense semantic excavation (DSE) module to perform semantic enhancement. In addition, we design the dual-branch joint optimization (DJO) decoder to integrate multi-level features for predicting high-quality saliency maps.
  • Figure 3: Details of the frequency-spatial domain transformer block.
  • Figure 4: Details of the dual-branch joint optimization decoder.
  • Figure 5: Quantitative results of the $PR$ and $F_m$ curves for UDCNet and other SOTA methods on three ORSIs-SOD datasets.
  • ...and 9 more figures