SurANet: Surrounding-Aware Network for Concealed Object Detection via Highly-Efficient Interactive Contrastive Learning Strategy

Yuhan Kang, Qingpeng Li, Leyuan Fang, Jian Zhao, Xuelong Li

TL;DR

SurANet, a novel deep Surrounding-Aware Network for COD tasks, introduces surrounding information into both feature extraction and the loss function to improve discrimination. Experimental results demonstrate that SurANet outperforms state-of-the-art COD methods on multiple real datasets.

Abstract

Concealed object detection (COD) in cluttered scenes is significant for various image processing applications. However, because concealed objects are always similar to their background, they are extremely hard to distinguish. The major obstacle is the tiny feature difference between the regions inside and outside the object boundary, which makes it difficult for existing COD methods to achieve accurate results. In this paper, considering that surrounding environment information can be well utilized to identify concealed objects, we propose a novel deep Surrounding-Aware Network, namely SurANet, for COD tasks, which introduces surrounding information into feature extraction and the loss function to improve discrimination. First, we enhance the semantics of feature maps using differential fusion of surrounding features to highlight concealed objects. Next, a Surrounding-Aware Contrastive Loss is applied to identify the concealed object by learning surrounding feature maps contrastively. Then, following our investigation of feature dynamics, SurANet can be trained end-to-end with high efficiency via our proposed Spatial-Compressed Correlation Transmission strategy, and extensive experiments prove that such features are well preserved. Finally, experimental results demonstrate that the proposed SurANet outperforms state-of-the-art COD methods on multiple real datasets. Our source code will be available at https://github.com/kyh433/SurANet.

Paper Structure

This paper contains 15 sections, 18 equations, 15 figures, 5 tables, 1 algorithm.

Figures (15)

  • Figure 1: Surrounding information can aid the detection of concealed objects. The left picture contains a concealed sand bubble crab that is difficult to identify. After attending to its surrounding area, it is easier to identify the object and classify the details.
  • Figure 2: Overall architecture of the Surrounding-Aware Network (SurANet). We propose a Surrounding-Aware Enhancement module and design a Surrounding-Aware Contrastive Loss to further enhance surrounding awareness. Specifically, the input image passes through three different generators to obtain the object features of each layer, including texture features, surrounding features, and edge features. Subsequently, the network enhances the surrounding features layer by layer through the surrounding enhancer and surrounding fusion. Finally, under the supervision of the Surrounding-Aware Contrastive Loss, the network progressively refines the segmentation results of concealed objects through surrounding fusion. The edge enhance module (EEM) and texture enhance module (TEM) are implemented as detailed in fan2021concealed.
  • Figure 3: Surrounding generator of the SAE module. The surrounding label is composed of the periphery of the ground truth after Gaussian blurring. Through the hierarchical computation of the surrounding decoder, the network generates surrounding maps at various spatial scales, effectively representing the surrounding environment.
  • Figure 4: Surrounding enhancer and surrounding fusion in the SAE module. The surrounding-aware feature (Sur-Aware $G_{sur}$) is composed of the texture feature $F_t$ and the edge feature $F_e$ under the constraint of the surrounding map. By iteratively fusing the texture feature (Obj-Texture $G_{obj}$) and the surrounding-aware feature on the coarse prediction (predicted $O_c$), SurANet enhances the awareness of object features and surrounding features. As a result, the final prediction effectively reduces environmental noise and enhances object details. The texture feature and group-reversal attention (GRA) are implemented as detailed in fan2021concealed.
  • Figure 5: The proposed SACLoss and Spatial-Compressed Correlation Transmission (SCCT) strategy. SACLoss pushes apart the negative sample pairs formed by pixels of the object and surrounding areas, and pulls together the positive sample pairs formed by pixels of the background and surrounding areas. After training, the difference between object and background is amplified, which makes the surrounding-aware feature (Sur-Aware) more distinct. The SCCT strategy transforms feature maps into compact representations and enables SACLoss to capture the relationships efficiently. It consists of a separation operation and a stack operation. In this case, the number of layers $k$ is 3. The fusion features $F^{(3)}_{fusion}$ are divided by interval into four independent parts $F^{(3)}_i$, which are then stacked along the channel dimension.
  • ...and 10 more figures
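
The separation and stack operations described in the Figure 5 caption can be illustrated with a small dependency-free sketch. It is not the authors' implementation: the function name `separate_and_stack`, the `stride` parameter, the nested-list layout (channels, rows, columns), and the ordering of the stacked parts are all assumptions made for illustration. With a sampling interval of 2, each feature map is split into four interleaved sub-maps, which matches the "four independent parts" in the caption and resembles a space-to-depth rearrangement.

```python
def separate_and_stack(feature, stride=2):
    """Split each channel into stride*stride interleaved sub-maps (separation)
    and concatenate them along the channel axis (stack).

    feature: list of C channels, each an H x W nested list (hypothetical layout).
    Returns a list of C * stride * stride channels of size (H/stride) x (W/stride).
    """
    height = len(feature[0])
    width = len(feature[0][0])
    if height % stride or width % stride:
        raise ValueError("height and width must be divisible by stride")

    stacked = []
    # Separation: one group of sub-maps per (row offset, column offset) pair.
    for dy in range(stride):
        for dx in range(stride):
            for channel in feature:
                # Keep every stride-th pixel, starting at position (dy, dx).
                part = [row[dx::stride] for row in channel[dy::stride]]
                # Stack: append the sub-map as a new channel.
                stacked.append(part)
    return stacked


# A single-channel 4 x 4 feature map with distinct values.
feature = [[[0, 1, 2, 3],
            [4, 5, 6, 7],
            [8, 9, 10, 11],
            [12, 13, 14, 15]]]

compact = separate_and_stack(feature)
print(len(compact), len(compact[0]), len(compact[0][0]))
```

In this sketch no pixel is discarded: the spatial size shrinks by the interval in each direction while the channel count grows by the same total factor, so a pairwise loss computed over spatial positions has fewer positions to compare.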