Synthesize Boundaries: A Boundary-aware Self-consistent Framework for Weakly Supervised Salient Object Detection

Binwei Xu, Haoran Liang, Ronghua Liang, Peng Chen

TL;DR

This work tackles boundary-accurate salient object detection under scribble-based weak supervision by introducing boundary supervision through synthetic concave regions inserted into scribble-labeled objects. A self-consistent framework with a Global Integral Branch (GIB) and a Boundary-Aware Branch (BAB) leverages real and synthetic images to jointly learn complete object saliency and sharp boundaries, reinforced by a carefully designed loss that fuses saliency structure, local coherence, and cross-branch consistency. Empirical results on five benchmarks demonstrate clear gains over state-of-the-art scribble-based methods and competitive performance against fully supervised approaches, validating the effectiveness of boundary-oriented synthetic data and the self-consistent training strategy. The approach offers a practical path to improving boundary quality in weakly supervised SOD without requiring extra data, with potential extensions to learning-based synthetic generation and broader weakly supervised tasks.

Abstract

Fully supervised salient object detection (SOD) has made considerable progress, but it relies on pixel-wise annotations that are expensive and time-consuming to collect. To relieve the labeling burden while maintaining performance, several scribble-based SOD methods have recently been proposed. However, learning precise boundary details from scribble annotations, which lack edge information, remains difficult. In this paper, we propose to learn precise boundaries from our designed synthetic images and labels without introducing any extra auxiliary data. Each synthetic image creates boundary information by inserting synthetic concave regions that simulate the real concave regions of salient objects. Furthermore, we propose a novel self-consistent framework that consists of a global integral branch (GIB) and a boundary-aware branch (BAB) to train a saliency detector. GIB takes the original image as input and aims to identify integral salient objects. BAB takes the synthetic image as input and aims to help predict accurate boundaries. The two branches are connected through a self-consistent loss that guides the saliency detector to predict precise boundaries while identifying salient objects. Experimental results on five benchmarks demonstrate that our method outperforms state-of-the-art weakly supervised SOD methods and further narrows the gap with fully supervised methods.
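
As a compact summary of how the two branches fit together, the training objective can be sketched as below. This is only a sketch: the per-branch terms follow the Figure 2 caption, while the unweighted sums, the symbol $L_{sc}$ for the self-consistent loss, and the names $L_{GIB}$, $L_{BAB}$, and $L_{total}$ are assumptions made here for illustration. Any weighting coefficients between the terms are not given in this section.

$$
\begin{aligned}
L_{GIB} &= L_{pce} + L_{lsc} + L_{ssc} && \text{(computed on the original image)} \\
L_{BAB} &= L_{pce} + L_{lsc} && \text{(computed on the synthetic image)} \\
L_{total} &= L_{GIB} + L_{BAB} + L_{sc}
\end{aligned}
$$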

Paper Structure (27 sections, 9 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: Sample results of our scribble-based SOD method compared with SCWSSOD [yu2021structure]. With our proposed synthetic images added as training data, our method can perceive tortuous edges and predict more accurate boundaries.
  • Figure 2: Overview of our proposed self-consistent framework. It consists of a global integral branch (GIB) and a boundary-aware branch (BAB). GIB, trained on original images, aims to identify integral salient objects, while BAB, trained on synthetic images, aims to help predict accurate boundaries. The local saliency coherence (LSC) loss $L_{lsc}$, the saliency structure consistency (SSC) loss $L_{ssc}$, and the partial cross-entropy loss $L_{pce}$ are applied to optimize the global integral branch. The LSC loss $L_{lsc}$ and the partial cross-entropy loss $L_{pce}$ are applied to optimize the boundary-aware branch. A self-consistent loss is employed to associate the two branches.
  • Figure 3: Illustration of synthetic image generation, which primarily consists of endpoint selection, concave region generation, and texture generation.
  • Figure 4: Comparison of predicted saliency maps for different saliency detectors. (a) A saliency detector trained without synthetic images. (b) A saliency detector trained with synthetic images added as training data. (c) A saliency detector combined with the self-consistent framework.
  • Figure 5: Visual comparisons of various methods. Each column denotes one approach and each row shows the saliency maps for one image. Our method predicts more complete salient objects and distinguishes their boundaries more clearly than the other methods.
  • ...and 1 more figures
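
The Figure 2 caption names a partial cross-entropy loss $L_{pce}$ for both branches. As a rough illustration of the general idea (not the authors' implementation), the sketch below computes binary cross-entropy only over pixels that a scribble actually labels and ignores the rest. The function name `partial_cross_entropy`, the `UNLABELED = -1` marker, and the nested-list image layout are all hypothetical choices made for this example.

```python
import math

# Hypothetical marker for pixels the scribble does not cover (an assumption).
UNLABELED = -1


def partial_cross_entropy(pred, scribble, eps=1e-7):
    """Mean binary cross-entropy over scribble-labeled pixels only.

    pred:     2D nested list of predicted saliency probabilities in [0, 1].
    scribble: 2D nested list of the same shape; 1 = foreground scribble,
              0 = background scribble, UNLABELED = no annotation.
    """
    total, count = 0.0, 0
    for pred_row, label_row in zip(pred, scribble):
        for p, y in zip(pred_row, label_row):
            if y == UNLABELED:
                continue  # unlabeled pixels contribute nothing to the loss
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    # With no labeled pixels there is nothing to supervise.
    return total / count if count else 0.0


if __name__ == "__main__":
    pred = [[0.9, 0.5], [0.2, 0.5]]
    scribble = [[1, UNLABELED], [0, UNLABELED]]
    print(partial_cross_entropy(pred, scribble))
```

Because unlabeled pixels are skipped, a loss of this kind gives no direct signal near object edges that scribbles do not touch, which is consistent with the abstract's point that scribble annotations lack edge information.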