Complete Instances Mining for Weakly Supervised Instance Segmentation

Zecheng Li, Zening Zeng, Yuqi Liang, Jin-Gang Yu

TL;DR

This work tackles weakly supervised instance segmentation using only image-level labels by addressing the persistent issue of redundant segmentation in proposal-based approaches. It introduces an online refinement framework centered on a MaskIoU head, Complete Instances Mining (CIM) to discover complete instances, and an Anti-noise strategy to suppress noisy pseudo labels, all seeded by pre-computed pseudo labels from AGPL. The method employs MaskFuse for enriched proposal representations and multiple refinement branches that progressively refine pseudo labels, guided by CIM. Experiments on VOC 2012 and COCO demonstrate state-of-the-art WSIS performance with notable gains over prior methods, highlighting the practical potential of online refinement and robust pseudo-labeling for weak supervision.

Abstract

Weakly supervised instance segmentation (WSIS) using only image-level labels is a challenging task due to the difficulty of aligning coarse annotations with the finer task. However, with the advancement of deep neural networks (DNNs), WSIS has garnered significant attention. Following a proposal-based paradigm, we encounter a redundant segmentation problem resulting from a single instance being represented by multiple proposals. For example, we feed a picture of a dog and proposals into the network and expect to output only one proposal containing a dog, but the network outputs multiple proposals. To address this problem, we propose a novel approach for WSIS that focuses on the online refinement of complete instances through the use of MaskIoU heads to predict the integrity scores of proposals and a Complete Instances Mining (CIM) strategy to explicitly model the redundant segmentation problem and generate refined pseudo labels. Our approach allows the network to become aware of multiple instances and complete instances, and we further improve its robustness through the incorporation of an Anti-noise strategy. Empirical evaluations on the PASCAL VOC 2012 and MS COCO datasets demonstrate that our method achieves state-of-the-art performance with a notable margin. Our implementation will be made available at https://github.com/ZechengLi19/CIM.
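
The redundant segmentation problem described above can be made concrete with a toy example. The sketch below is illustrative only and is not taken from the authors' implementation: masks are modeled as sets of pixel coordinates, `rect_mask`, `mask_iou`, and `redundant_with` are hypothetical helpers, and the overlap threshold of 0.1 is an assumed value.

```python
from typing import List, Set, Tuple

Mask = Set[Tuple[int, int]]  # a mask as a set of (row, col) pixels


def rect_mask(top: int, left: int, bottom: int, right: int) -> Mask:
    """Toy mask: every pixel in the half-open rectangle [top, bottom) x [left, right)."""
    return {(r, c) for r in range(top, bottom) for c in range(left, right)}


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def redundant_with(masks: List[Mask], ref: int, iou_thr: float = 0.1) -> List[int]:
    """Indices of proposals that overlap proposal `ref` by more than `iou_thr` (assumed value)."""
    return [
        i for i, m in enumerate(masks)
        if i != ref and mask_iou(masks[ref], m) > iou_thr
    ]


# One dog covered by three proposals, plus an unrelated object elsewhere.
dog_full = rect_mask(0, 0, 10, 10)    # the complete instance
dog_head = rect_mask(0, 0, 4, 10)     # partial proposal
dog_body = rect_mask(2, 0, 10, 10)    # partial proposal
other = rect_mask(0, 20, 10, 30)      # different instance, no overlap

masks = [dog_full, dog_head, dog_body, other]
overlapping = redundant_with(masks, ref=0)
print(overlapping)
```

Here the head and body proposals both overlap the full-dog proposal, so a network that scores all three as "dog" would report one instance three times.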

Paper Structure (21 sections, 8 equations, 4 figures, 6 tables, 1 algorithm)

Figures (4)

  • Figure 1: Redundant segmentation. Each instance always corresponds to multiple proposals. Yellow boxes: expected segmentations. Red boxes: redundant segmentations.
  • Figure 2: Overview of our proposed method. Our framework contains three main components: an Anti-noise branch, $K$ Refinement branches, and a Complete Instances Mining (CIM) strategy. Proposal features are generated by MaskFuse and forked into multiple branches. Both the Anti-noise and Refinement branches output classification and integrity scores. CIM uses the output of the preceding branch to generate refined pseudo labels that supervise the next branch, while the Anti-noise branch is supervised by pre-computed pseudo labels. In the right column, purple and red represent seeds and pseudo ground truth, respectively. The seeds spread in space to find complete proposals as pseudo ground truth, using spatial relationships and integrity scores.
  • Figure 3: Visualization results on the VOC 2012 dataset. Comparison with BESTIE.
  • Figure 4: Performance of pseudo GT and seeds generated by CIM.
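
The Figure 2 caption describes seeds that spread in space to find complete proposals, using spatial relationships and integrity scores. The sketch below shows one way such a step could look. It is a simplification under stated assumptions, not the paper's algorithm: seeds are taken to be proposals whose classification score reaches `seed_thr`, "spatial relationship" is approximated by mask IoU above `iou_thr`, and the function name, both thresholds, and all scores are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

Mask = Set[Tuple[int, int]]  # a mask as a set of (row, col) pixels


def rect_mask(top: int, left: int, bottom: int, right: int) -> Mask:
    """Toy mask: every pixel in the half-open rectangle [top, bottom) x [left, right)."""
    return {(r, c) for r in range(top, bottom) for c in range(left, right)}


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def mine_complete_instances(
    masks: Sequence[Mask],
    cls_scores: Sequence[float],
    integrity_scores: Sequence[float],
    seed_thr: float = 0.5,   # assumed value
    iou_thr: float = 0.1,    # assumed value
) -> List[int]:
    """Return one proposal index per mined instance (illustrative only)."""
    # Visit candidate seeds from the most to the least confident class score.
    order = sorted(range(len(masks)), key=lambda i: cls_scores[i], reverse=True)
    used: Set[int] = set()
    pseudo_gt: List[int] = []
    for seed in order:
        if cls_scores[seed] < seed_thr:
            break  # remaining proposals are too weak to act as seeds
        if seed in used:
            continue  # already absorbed by an earlier seed's neighborhood
        # Spread from the seed to unused proposals that overlap it.
        group = [seed] + [
            j for j in range(len(masks))
            if j != seed and j not in used
            and mask_iou(masks[seed], masks[j]) > iou_thr
        ]
        # Keep the neighbor judged most complete, not the seed itself.
        pseudo_gt.append(max(group, key=lambda j: integrity_scores[j]))
        used.update(group)
    return pseudo_gt


masks = [
    rect_mask(0, 0, 10, 10),    # 0: whole dog
    rect_mask(0, 0, 4, 10),     # 1: dog head only
    rect_mask(2, 0, 10, 10),    # 2: dog body only
    rect_mask(0, 20, 10, 30),   # 3: a second, separate object
]
cls_scores = [0.6, 0.9, 0.7, 0.8]        # the head is the most discriminative part
integrity_scores = [0.9, 0.3, 0.6, 0.8]  # the whole-dog proposal is the most complete

selected = mine_complete_instances(masks, cls_scores, integrity_scores)
print(selected)
```

In this toy setup the highest-scoring seed is the dog-head proposal, but the whole-dog proposal is selected as pseudo ground truth because it has the highest integrity score among the seed's overlapping neighbors. The separate object yields its own pseudo label.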