P2Seg: Pointly-supervised Segmentation via Mutual Distillation

Zipeng Wang, Xuehui Yu, Xumeng Han, Wenwen Yu, Zhixun Huang, Jianbin Jiao, Zhenjun Han

TL;DR

This work tackles point-level supervised instance segmentation (PSIS) by introducing a Mutual Distillation Module (MDM) that jointly leverages semantic boundaries and instance-level cues. Through two mutually reinforcing branches, Semantic to Instance (S2I) and Instance to Semantic (I2S), the method transfers information in both directions to produce high-quality instance maps without pre-trained proposals. The approach achieves state-of-the-art PSIS performance on VOC 2012 and competitive results on COCO, and ablation studies confirm the contributions of both S2I and I2S. By fusing semantic and instance information, the method reduces annotation cost while improving boundary precision and intra-class discrimination, which makes it practical for scalable segmentation.

Abstract

Point-level Supervised Instance Segmentation (PSIS) aims to enhance the applicability and scalability of instance segmentation by utilizing low-cost yet instance-informative annotations. Existing PSIS methods usually rely on positional information to distinguish objects, but predicting precise boundaries remains challenging due to the lack of contour annotations. Nevertheless, weakly supervised semantic segmentation methods are proficient in utilizing intra-class feature consistency to capture the boundary contours of the same semantic regions. In this paper, we design a Mutual Distillation Module (MDM) to leverage the complementary strengths of both instance position and semantic information and achieve accurate instance-level object perception. The MDM consists of Semantic to Instance (S2I) and Instance to Semantic (I2S). S2I is guided by the precise boundaries of semantic regions to learn the association between annotated points and instance contours. I2S leverages discriminative relationships between instances to facilitate the differentiation of various objects within the semantic map. Extensive experiments substantiate the efficacy of MDM in fostering the synergy between instance and semantic information, consequently improving the quality of instance-level object representations. Our method achieves 55.7 mAP$_{50}$ and 17.6 mAP on the PASCAL VOC and MS COCO datasets, respectively, significantly outperforming recent PSIS methods and several box-supervised instance segmentation competitors.

Paper Structure (14 sections, 5 equations, 12 figures, 6 tables)

Figures (12)

  • Figure 1: Two limitations of PSIS methods: Left: severe local responses caused by points separated with image features. Right: ambiguous boundaries caused by missing instance information.
  • Figure 2: Illustration of the complementary advantages of S2I and I2S. Semantic and instance information is mutually distilled, resulting in high-quality instance segmentation maps.
  • Figure 3: The architecture of MDM. In the S2I branch, the instance segmentation map is generated from the semantic segmentation results using the offset map. In the I2S branch, the semantic segmentation results are refined by the instance segmentation map through an affinity matrix. Finally, the instance segmentation map generated by the trained S2I branch of MDM is used as supervision for training the segmentor.
  • Figure 4: The abstract concept of semantic and instance information interaction. Corresponding to the MDM flowchart figure, the S2I branch is colored in green, while the I2S branch is colored in blue.
  • Figure 5: The architecture of the S2I branch. Left: The semantic segmentation map generates the offset map and class maps used as supervision. Right: The process of generating the new instance segmentation map.
  • ...and 7 more figures
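
The captions for Figures 3 and 5 describe two data flows: S2I turns a semantic map into an instance map with the help of an offset map, and I2S lets the instance map influence the semantic result through an affinity matrix. The NumPy sketch below illustrates one plausible reading of those two steps on a toy one-row image. It is not the authors' implementation: the function names `s2i_group_pixels` and `i2s_refine_scores`, the nearest-point assignment rule, and the binary same-instance affinity are all assumptions made for illustration.

```python
import numpy as np


def s2i_group_pixels(semantic_map, offset_map, points, point_classes):
    """Assumed S2I step: assign each foreground pixel to an annotated point.

    semantic_map:  (H, W) int array, 0 = background, k > 0 = class id.
    offset_map:    (H, W, 2) float array of (dy, dx) shifts toward the point.
    points:        (N, 2) float array of annotated (y, x) coordinates.
    point_classes: (N,) int array with the class id of each point.
    Returns an (H, W) int array: 0 = unassigned, i + 1 = pixel belongs to point i.
    """
    h, w = semantic_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    shifted = np.stack([ys, xs], axis=-1) + offset_map  # (H, W, 2)
    # Squared distance from every shifted pixel to every point: (H, W, N).
    d2 = ((shifted[:, :, None, :] - points[None, None, :, :]) ** 2).sum(-1)
    # A pixel may only join a point that carries its own semantic class.
    mismatch = semantic_map[:, :, None] != point_classes[None, None, :]
    d2 = np.where(mismatch, np.inf, d2)
    has_match = np.isfinite(d2.min(axis=-1))
    return np.where(has_match, d2.argmin(axis=-1) + 1, 0)


def i2s_refine_scores(class_scores, instance_map):
    """Assumed I2S step: average class scores over pixels of the same instance.

    The affinity matrix is binary (1 if two pixels share an instance id) and
    row-normalised, so the matrix product is a per-region mean.
    """
    h, w, c = class_scores.shape
    ids = instance_map.reshape(-1)
    affinity = (ids[:, None] == ids[None, :]).astype(float)  # (HW, HW)
    affinity /= affinity.sum(axis=1, keepdims=True)
    return (affinity @ class_scores.reshape(-1, c)).reshape(h, w, c)


# Toy 1 x 6 image: five pixels of class 1, one background pixel.
semantic_map = np.array([[1, 1, 1, 1, 1, 0]])
offset_map = np.zeros((1, 6, 2))
offset_map[0, 2] = (0.0, -2.0)  # pixel x=2 is pulled toward the point at x=0
points = np.array([[0.0, 0.0], [0.0, 3.0]])
point_classes = np.array([1, 1])

instance_map = s2i_group_pixels(semantic_map, offset_map, points, point_classes)
print(instance_map.tolist())  # [[1, 1, 1, 2, 2, 0]]

fg = np.array([0.9, 0.5, 0.7, 0.8, 0.6, 0.1])
class_scores = np.stack([1.0 - fg, fg], axis=-1)[None]  # (1, 6, 2)
refined = i2s_refine_scores(class_scores, instance_map)
print(np.round(refined[0, :, 1], 2).tolist())  # [0.7, 0.7, 0.7, 0.7, 0.7, 0.1]
```

In the toy run, the nonzero offset at x=2 moves that pixel from the second point to the first, which is the role the captions attribute to the offset map. The dense affinity matrix has (HW)² entries, so this form only suits small inputs; a practical version would need a sparse or local affinity.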