BAISeg: Boundary Assisted Weakly Supervised Instance Segmentation

Tengbo Wang, Yu Bai

TL;DR

Boundary-Assisted Instance Segmentation (BAISeg) is a novel paradigm for weakly supervised instance segmentation (WSIS) that performs instance segmentation using only pixel-level annotations and strengthens the continuity and closedness of instance boundaries.

Abstract

How to extract instance-level masks without instance-level supervision is the main challenge of weakly supervised instance segmentation (WSIS). Popular WSIS methods estimate a displacement field (DF) by learning inter-pixel relations and then perform clustering to identify instances. However, the resulting instance centroids are inherently unstable and vary significantly across clustering algorithms. In this paper, we propose Boundary-Assisted Instance Segmentation (BAISeg), a novel paradigm for WSIS that realizes instance segmentation with pixel-level annotations. BAISeg comprises an instance-aware boundary detection (IABD) branch and a semantic segmentation branch. The IABD branch identifies instances by predicting class-agnostic instance boundaries rather than instance centroids, which distinguishes it from previous DF-based approaches. In particular, we propose the Cascade Fusion Module (CFM) and the Deep Mutual Attention (DMA) in the IABD branch to obtain rich contextual information and capture instance boundaries with weak responses. During the training phase, we employ Pixel-to-Pixel Contrast to enhance the discriminative capacity of the IABD branch, which further strengthens the continuity and closedness of the instance boundaries. Extensive experiments on PASCAL VOC 2012 and MS COCO demonstrate the effectiveness of our approach, and we achieve considerable performance with only pixel-level annotations. The code will be available at https://github.com/wsis-seg/BAISeg.
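
To make concrete why the continuity and closedness of boundaries matter for this paradigm, the following is a minimal sketch, not the authors' Mask Extraction Pipeline. It assumes that instances can be read off as 4-connected regions of foreground pixels that are not on a predicted boundary, with the class taken by majority vote from the semantic map. The function name `extract_instances` and the toy inputs are hypothetical.

```python
from collections import deque


def extract_instances(semantic, boundary):
    """Group non-boundary foreground pixels into instances (illustrative only).

    semantic: 2D list of ints, 0 = background, k > 0 = class id.
    boundary: 2D list of 0/1, 1 = predicted class-agnostic instance boundary.
    Returns (labels, classes): labels holds an instance id per pixel (0 = none),
    classes maps each instance id to its majority class id.
    """
    h, w = len(semantic), len(semantic[0])
    labels = [[0] * w for _ in range(h)]
    classes = {}
    next_id = 1
    for sy in range(h):
        for sx in range(w):
            if semantic[sy][sx] == 0 or boundary[sy][sx] or labels[sy][sx]:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            votes = {}
            queue = deque([(sy, sx)])
            labels[sy][sx] = next_id
            while queue:
                y, x = queue.popleft()
                cls = semantic[y][x]
                votes[cls] = votes.get(cls, 0) + 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not labels[ny][nx]
                            and semantic[ny][nx] != 0
                            and not boundary[ny][nx]):
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
            classes[next_id] = max(votes, key=votes.get)
            next_id += 1
    return labels, classes


# Toy input: a 3x5 region of class 1 split by a vertical boundary.
semantic = [[1] * 5 for _ in range(3)]
closed = [[0, 0, 1, 0, 0] for _ in range(3)]
leaky = [row[:] for row in closed]
leaky[1][2] = 0  # a one-pixel gap in the boundary

_, closed_classes = extract_instances(semantic, closed)
_, leaky_classes = extract_instances(semantic, leaky)
print(len(closed_classes), len(leaky_classes))
```

In this toy case the closed boundary separates two instances, while the single-pixel gap lets the flood fill leak across and merges them into one, which is the failure mode that a more continuous, closed boundary prediction avoids.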

Paper Structure (42 sections, 8 equations, 10 figures, 12 tables)

Figures (10)

  • Figure 1: Overview of our proposed approach, BAISeg, covering both label generation and the prediction of instance segmentation masks. Unlike existing methods, ours does not require instance-level annotations; its training labels can be derived from existing semantic segmentation masks, which makes it more efficient. Samples are from the PASCAL VOC 2012 dataset [everingham2010pascal].
  • Figure 2: Our proposed BAISeg architecture (a) mainly consists of two parallel branches with a shared backbone: the IABD branch (b) and the semantic segmentation branch. The IABD branch determines the boundaries of instances by predicting instance-aware boundary maps and extracts class-agnostic masks via the Mask Extraction Pipeline. The semantic segmentation branch predicts the semantic maps of instances. Instance segmentation results are obtained by combining the semantic maps with the class-agnostic instance masks. The entire network is optimized by minimizing the $\mathcal{L}_\text{P}$, $\mathcal{L}_\text{B}$, and $\mathcal{L}_\text{S}$ losses on pixel-level annotations. The function $T$ extracts semantic contour edge labels from the segmentation masks by computing spatial gradients. This sample is taken from the validation set of PASCAL VOC 2012.
  • Figure 3: Mutual Attention Unit.
  • Figure 4: Visualization results on the COCO 2017 dataset. Comparison with CIM [zecheng2023CIM].
  • Figure 5: Illustration of the estimated Displacement Field and Centroids under different levels of annotation. "p_thr=0.1" represents the Centroids generated after filtering with a threshold of 0.1 using pixel-level annotation. Visualization content is generated by BESTIE.
  • ...and 5 more figures
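
The caption of Figure 2 describes a function $T$ that extracts semantic contour edge labels from segmentation masks via spatial gradients. As a rough illustration only, the sketch below marks a pixel as an edge when its label differs from its right or lower neighbour, i.e. when a forward difference of the label map is nonzero. The exact gradient operator used in the paper is not given here, so this choice and the name `semantic_edges` are assumptions.

```python
def semantic_edges(mask):
    """Binary edge map from a semantic label mask (illustrative stand-in for T).

    mask: 2D list of integer class ids.
    A pixel is marked 1 if its label differs from the pixel to its right or
    the pixel below it (nonzero forward difference), else 0.
    """
    h, w = len(mask), len(mask[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            differs_right = x + 1 < w and mask[y][x + 1] != mask[y][x]
            differs_down = y + 1 < h and mask[y + 1][x] != mask[y][x]
            edges[y][x] = int(differs_right or differs_down)
    return edges


# Toy mask: a 2x2 object of class 1 on a background of class 0.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
for row in semantic_edges(mask):
    print(row)
```

Because the input is a semantic mask, edges appear only where the class changes; two touching objects of the same class produce no edge between them, which is consistent with the captions distinguishing semantic contour labels from the instance-aware boundaries predicted by the IABD branch.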