FIPGNet: Pyramid grafting network with feature interaction strategies

Ziyi Ding, Like Xin

TL;DR

This work addresses the difficulty that pyramid grafting networks have in accurately localizing salient objects by introducing FIPGNet, a pyramid grafting network augmented with an attention-based feature interaction strategy (FIA). FIA uses Spatial Agent Cross Attention (SACA) for multi-level spatial feature interaction, and a Channel Agent Cross Attention Module (CCM) connects the backbone features with the SACA-processed features to eliminate inconsistencies between them. Together they target the cross-scale correlations that simple feature aggregation ignores. The approach includes token generation, an MLP residual refinement, and the CCM for consistent feature fusion, yielding improved salient-region localization. Experiments on six datasets show that FIPGNet outperforms twelve existing salient object detection (SOD) methods on four metrics, highlighting the practical value of cross-scale, attention-driven feature interaction.

Abstract

Salient object detection aims to identify the objects in an image that attract the most visual attention. Currently, the most advanced salient object detection methods adopt a pyramid grafting network architecture. However, this architecture still fails to locate salient objects accurately. We observe that this is mainly because current salient object detection methods simply aggregate features of different scales and ignore the correlations between them. To overcome these problems, we propose a new salient object detection framework, FIPGNet, which is a pyramid grafting network with feature interaction strategies. Specifically, we propose an attention-based feature interaction strategy (FIA) that introduces Spatial Agent Cross Attention (SACA) to achieve multi-level feature interaction, highlighting important regions from a spatial perspective and thereby enhancing salient regions. We also propose a Channel Agent Cross Attention Module (CCM), which connects the features extracted by the backbone network with the features processed by the spatial agent cross attention module, eliminating inconsistencies between them. Together, these two modules address the salient object localization problem in current pyramid grafting network models. Experimental results on six challenging datasets show that the proposed method outperforms 12 current salient object detection methods on four metrics.
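
The abstract does not spell out how SACA is computed, so the following is only a minimal sketch of the general agent-attention pattern that the name suggests: a small set of agent tokens first gathers information from one feature set, and the query tokens then read from those agents. Everything specific here is an assumption made for illustration and not the authors' implementation: single-head attention, no learned projections, average pooling to form the agents, a plain residual add, and all function names (pool_agents, agent_cross_attention, and the helpers).

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    d = len(q[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)


def pool_agents(tokens, num_agents):
    """Average-pool consecutive groups of tokens into agent tokens (assumed)."""
    n = len(tokens)
    if not 1 <= num_agents <= n:
        raise ValueError("num_agents must be between 1 and the token count")
    bounds = [i * n // num_agents for i in range(num_agents + 1)]
    agents = []
    for start, end in zip(bounds, bounds[1:]):
        group = tokens[start:end]
        agents.append([sum(col) / len(group) for col in zip(*group)])
    return agents


def agent_cross_attention(query_tokens, context_tokens, num_agents=2):
    """Hypothetical cross-scale interaction routed through agent tokens."""
    agents = pool_agents(query_tokens, num_agents)
    # Step 1: each agent aggregates information from the other scale.
    agent_values = attention(agents, context_tokens, context_tokens)
    # Step 2: each query token reads the aggregated information from the agents.
    out = attention(query_tokens, agents, agent_values)
    # Residual connection keeps the original query features (assumed).
    return [[x + y for x, y in zip(q, o)] for q, o in zip(query_tokens, out)]
```

Routing the interaction through a few agents means the attention maps have sizes (agents x context) and (queries x agents) rather than (queries x context), which is the usual motivation for agent-style attention on high-resolution feature maps. Whether FIPGNet forms its agents by pooling, and which scale supplies the queries, is not stated in this summary.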

Paper Structure

This paper contains 17 sections, 1 equation, 8 figures, and 3 tables.

Figures (8)

  • Figure 1: The comparison between our method and other methods in salient object detection
  • Figure 2: Illustration of different architectures. Blue blocks, pink blocks and yellow blocks respectively denote the different convolutional blocks in the encoder, the transport layer and the decoder.
  • Figure 3: Illustration of the proposed FIPGNet. The framework consists mainly of the backbone network, the FIA part, the CCM, and the decoder.
  • Figure 4: Illustration of the learning procedure of the proposed FIA modules
  • Figure 5: Illustration of the learning procedure of the proposed SACA modules
  • ...and 3 more figures