CW-BASS: Confidence-Weighted Boundary-Aware Learning for Semi-Supervised Semantic Segmentation

Ebenezer Tarubinga, Jenifer Kalafatovich, Seong-Whan Lee

TL;DR

CW-BASS tackles semi-supervised semantic segmentation by addressing coupling, confirmation bias, and boundary blur through a two-stage teacher-student framework. It introduces a confidence-weighted cross-entropy loss, a dynamic thresholding policy, a Sobel-based boundary-aware module, and a confidence decay strategy to progressively refine pseudo-labels and boundaries. The approach achieves state-of-the-art results on Pascal VOC 2012 and Cityscapes, including 75.81% mIoU on Pascal VOC with 1/8 labeled data and 65.87% mIoU on Cityscapes with 1/30 labeled data, while using two networks and reduced unlabeled data for faster convergence. These contributions offer practical gains for learning from limited annotations and enable efficient deployment in data-constrained settings.

Abstract

Semi-supervised semantic segmentation (SSSS) aims to improve segmentation performance by utilizing large amounts of unlabeled data with limited labeled samples. Existing methods often suffer from coupling, where over-reliance on initial labeled data leads to suboptimal learning; confirmation bias, where incorrect predictions reinforce themselves repeatedly; and boundary blur caused by limited boundary-awareness and ambiguous edge cues. To address these issues, we propose CW-BASS, a novel framework for SSSS. In order to mitigate the impact of incorrect predictions, we assign confidence weights to pseudo-labels. Additionally, we leverage boundary-delineation techniques, which, despite being extensively explored in weakly-supervised semantic segmentation (WSSS), remain underutilized in SSSS. Specifically, our method: (1) reduces coupling via a confidence-weighted loss that adjusts pseudo-label influence based on their predicted confidence scores, (2) mitigates confirmation bias with a dynamic thresholding mechanism that learns to filter out pseudo-labels based on model performance, (3) tackles boundary blur using a boundary-aware module to refine segmentation near object edges, and (4) reduces label noise through a confidence decay strategy that progressively refines pseudo-labels during training. Extensive experiments on Pascal VOC 2012 and Cityscapes demonstrate that CW-BASS achieves state-of-the-art performance. Notably, CW-BASS achieves a 65.9% mIoU on Cityscapes under a challenging and underexplored 1/30 (3.3%) split (100 images), highlighting its effectiveness in limited-label settings. Our code is available at https://github.com/psychofict/CW-BASS.
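The confidence-weighted loss and threshold filtering described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the weighting form (teacher confidence raised to a hypothetical exponent `gamma`), the normalization by the sum of weights, and the moving-average rule in `update_threshold` are all assumptions chosen for illustration, as are the function names.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def pseudo_labels_with_confidence(teacher_logits):
    """teacher_logits has shape (C, H, W); returns labels and confidence, both (H, W)."""
    probs = softmax(teacher_logits, axis=0)
    return probs.argmax(axis=0), probs.max(axis=0)

def confidence_weighted_ce(student_logits, labels, confidence, threshold, gamma=1.0):
    """Cross-entropy against pseudo-labels, weighted per pixel by teacher confidence.

    Pixels whose confidence is below `threshold` get weight 0 (filtered out).
    The weight confidence**gamma and the division by the weight sum are
    assumptions of this sketch, not the paper's stated formula.
    """
    probs = softmax(student_logits, axis=0)
    rows, cols = np.indices(labels.shape)
    picked = probs[labels, rows, cols]          # student prob. of the pseudo-label
    ce = -np.log(picked + 1e-12)
    weights = np.where(confidence >= threshold, confidence ** gamma, 0.0)
    kept = weights.sum()
    return float((weights * ce).sum() / kept) if kept > 0 else 0.0

def update_threshold(threshold, confidence, momentum=0.9, lo=0.5, hi=0.95):
    """Hypothetical dynamic rule: move the threshold toward the mean confidence."""
    target = float(confidence.mean())
    new = momentum * threshold + (1.0 - momentum) * target
    return float(np.clip(new, lo, hi))
```

With a high threshold, only confidently labeled pixels contribute to the loss; lowering it lets uncertain pixels in, but each is still down-weighted by its confidence rather than counted at full strength.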

Paper Structure

This paper contains 33 sections, 16 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Qualitative comparison with SOTA methods on Pascal VOC 2012 under the 1/8 supervised protocol. Our method, CW-BASS, excels in limited-label cases. From left to right: input image, ground truth, ST++ [yang2022st++], UniMatch [yang2023revisiting], and CW-BASS (ours). Red rectangles highlight regions where our method improves segmentation performance.
  • Figure 2: Overview of the CW-BASS Framework. In Stage 1, the teacher model generates pseudo-labels with confidence scores for unlabeled data. The confidence-weighted loss and dynamic thresholding filter reliable predictions to train the student model. In Stage 2, a confidence decay strategy and boundary-aware module progressively improve segmentation accuracy near object boundaries.
  • Figure 3: Qualitative comparisons of our method, CW-BASS, with other state-of-the-art methods, ST++ [yang2022st++] and UniMatch [yang2023revisiting], on the Pascal VOC 2012 and Cityscapes datasets. Red rectangles highlight regions of improved segmentation performance over the baseline. All methods are compared using a ResNet-50 backbone under the 1/8 setting for validation.
  • Figure 4: Comparison of performance over the first 20 epochs for SOTA methods on the Cityscapes 1/16 supervised partition. Our method initializes faster than the SOTA methods PS-MT and ST++.
  • Figure 5: Ablation study comparisons on Cityscapes under the 1/8 split. The predicted masks are visualized over the input images.
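
The Sobel-based boundary-aware module mentioned in the TL;DR and in the Figure 2 overview can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's module: it applies the standard 3x3 Sobel kernels to a label map to locate class transitions, then up-weights those pixels by a hypothetical factor `alpha`. The function names, the edge-replicated padding, and the weighting form `1 + alpha * mask` are choices made for this example.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def filter3x3(image, kernel):
    """Cross-correlate a 2D array with a 3x3 kernel, using edge-replicated padding."""
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def sobel_boundary_mask(label_map, eps=1e-6):
    """Boolean mask of pixels where the Sobel gradient magnitude is non-zero."""
    gx = filter3x3(label_map, SOBEL_X)
    gy = filter3x3(label_map, SOBEL_Y)
    return np.hypot(gx, gy) > eps

def boundary_weight_map(label_map, alpha=2.0):
    """Per-pixel loss weights: 1 in region interiors, 1 + alpha on boundaries.

    `alpha` is a hypothetical emphasis factor introduced for this sketch.
    """
    return 1.0 + alpha * sobel_boundary_mask(label_map).astype(np.float64)
```

Multiplying a per-pixel loss by such a weight map makes errors near object edges cost more than errors inside regions. Because the Sobel response is a signed difference, a one-pixel-wide structure can cancel at its own center; its neighbors are still flagged, so the edge is not lost entirely.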