DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation

Yuanchen Wu, Xichen Ye, Kequan Yang, Jide Li, Xiaoqiang Li

TL;DR

A dual student framework with trustworthy progressive learning (DuPL) is proposed with a discrepancy loss to yield diverse CAMs for each sub-net, mitigating the confirmation bias caused by learning their own incorrect pseudo-labels.

Abstract

Recently, one-stage Weakly Supervised Semantic Segmentation (WSSS) with image-level labels has gained increasing interest because it is simpler than its cumbersome multi-stage counterpart. Limited by the inherent ambiguity of the Class Activation Map (CAM), we observe that one-stage pipelines often suffer from confirmation bias caused by incorrect CAM pseudo-labels, which impairs their final segmentation performance. Although recent works discard many unreliable pseudo-labels to implicitly alleviate this issue, they fail to exploit sufficient supervision for their models. To this end, we propose a dual student framework with trustworthy progressive learning (DuPL). Specifically, we propose a dual student network with a discrepancy loss to yield diverse CAMs for each sub-net. The two sub-nets generate supervision for each other, mitigating the confirmation bias caused by learning their own incorrect pseudo-labels. In this process, we progressively introduce more trustworthy pseudo-labels into the supervision through dynamic threshold adjustment with an adaptive noise filtering strategy. Moreover, we believe that every pixel, even one discarded from supervision due to its unreliability, is important for WSSS. Thus, we develop consistency regularization on these discarded regions, providing supervision for every pixel. Experimental results demonstrate the superiority of the proposed DuPL over recent state-of-the-art alternatives on the PASCAL VOC 2012 and MS COCO datasets. Code is available at https://github.com/Wu0409/DuPL.
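The interplay of the discrepancy loss, cross-supervision, and dynamic threshold described in the abstract can be illustrated with a small sketch. It is not the authors' implementation (see the linked repository for that). The function names, the cosine-based form of the discrepancy loss, the linear threshold schedule, the threshold values, and the ignore label `255` are all assumptions made for illustration, and CAMs are treated as flat lists of single-class scores in [0, 1].

```python
import math

IGNORE = 255  # assumed label for pixels left out of segmentation supervision


def cosine_similarity(a, b):
    """Cosine similarity between two flattened CAMs."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)


def discrepancy_loss(cam_a, cam_b):
    """Assumed form: large when the two CAMs are similar, zero when orthogonal."""
    sim = cosine_similarity(cam_a, cam_b)
    sim = min(max(sim, 0.0), 1.0 - 1e-6)  # clamp so the log stays finite
    return -math.log(1.0 - sim)


def dynamic_threshold(step, total_steps, start=0.9, end=0.6):
    """Assumed linear schedule: the foreground threshold is relaxed over
    training, so more pixels receive pseudo-labels as training proceeds."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * progress


def pseudo_labels(cam, fg_thresh, bg_thresh=0.25):
    """Confident pixels become foreground (1) or background (0);
    everything in between is ignored."""
    labels = []
    for score in cam:
        if score >= fg_thresh:
            labels.append(1)
        elif score <= bg_thresh:
            labels.append(0)
        else:
            labels.append(IGNORE)
    return labels


def cross_supervision_targets(cam_a, cam_b, fg_thresh):
    """Each sub-net is supervised by the pseudo-labels of the other sub-net."""
    target_for_a = pseudo_labels(cam_b, fg_thresh)
    target_for_b = pseudo_labels(cam_a, fg_thresh)
    return target_for_a, target_for_b


# Usage: two toy CAMs over four pixels, early and late in training.
cam_a = [0.95, 0.70, 0.40, 0.05]
cam_b = [0.80, 0.92, 0.10, 0.20]
early = cross_supervision_targets(cam_a, cam_b, dynamic_threshold(0, 100))
late = cross_supervision_targets(cam_a, cam_b, dynamic_threshold(100, 100))
```

Because each sub-net learns from the other's labels, an error in one sub-net's CAM is not fed straight back into the same sub-net. The discrepancy term keeps the two CAMs from collapsing into identical predictions, which would cancel that benefit.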

Paper Structure

This paper contains 12 sections, 9 equations, 7 figures, and 6 tables.

Figures (7)

  • Figure 1: CAM pseudo-labels (train) vs. segmentation performance (val) on PASCAL VOC 2012. DuPL outperforms state-of-the-art one-stage competitors and achieves comparable performance with multi-stage methods in terms of CAM pseudo-labels and final segmentation performance. $\dagger$ denotes using ImageNet-21k pretrained parameters.
  • Figure 2: Confirmation bias of CAM. As training proceeds, the bias is consistently reinforced, impairing the final segmentation performance. Here, we use the ViT-B baseline (Dosovitskiy et al., 2020) and introduce more unreliable pseudo-labels to amplify this phenomenon.
  • Figure 3: The overall framework of DuPL. We use a discrepancy loss $\mathcal{L}_{dis}$ to constrain the two sub-nets to generate diverse CAMs. Their CAM pseudo-labels are used for segmentation cross-supervision $\mathcal{L}_{seg}$, which mitigates the CAM confirmation bias. In this process, we set a dynamic threshold to progressively introduce more pixels into the segmentation supervision. An Adaptive Noise Filtering strategy minimizes the noise in the pseudo-labels based on the segmentation loss distribution. To utilize every pixel, consistency regularization $\mathcal{L}_{reg}$ is applied between the filtered regions and their perturbed counterparts. The classifier is simplified for clarity of illustration.
  • Figure 4: The loss distribution of images with noisy pseudo-labels. The model produces incorrect pseudo-labels for plant. Two peaks appear in the loss distribution on the two pseudo-labels, and the red peak with anomalous losses is mainly caused by noise. The distribution of normal losses is rescaled for visualization.
  • Figure 5: Visual comparison of CAMs. We compare the state-of-the-art one-stage approach, ToCo (Ru et al., 2023), with our proposed DuPL. DuPL not only suppresses over-activations but also achieves more complete object activation coverage.
  • ...and 2 more figures
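
The captions of Figures 3 and 4 describe filtering noisy pseudo-labels through the segmentation loss distribution and then applying consistency regularization to the filtered pixels. The sketch below is a simplified illustration of that idea and not the paper's procedure. It separates the two peaks with a one-dimensional two-cluster split, which is a stand-in chosen for brevity, and uses a mean squared difference as the consistency term. The function names and the `min_gap` parameter are hypothetical.

```python
def split_two_peaks(losses, iters=20):
    """1-D two-cluster split: returns the centres of the low-loss and
    high-loss groups. A stand-in for a proper mixture-model fit."""
    lo_c, hi_c = min(losses), max(losses)
    for _ in range(iters):
        low = [l for l in losses if abs(l - lo_c) <= abs(l - hi_c)]
        high = [l for l in losses if abs(l - lo_c) > abs(l - hi_c)]
        if not high:  # all losses fall in one group, nothing to split
            break
        lo_c = sum(low) / len(low)
        hi_c = sum(high) / len(high)
    return lo_c, hi_c


def noise_mask(losses, min_gap=1.0):
    """True for pixels assigned to the anomalous high-loss peak.
    If the two centres are closer than min_gap (assumed value),
    the losses are treated as a single peak and nothing is filtered."""
    lo_c, hi_c = split_two_peaks(losses)
    if hi_c - lo_c < min_gap:
        return [False] * len(losses)
    return [abs(l - lo_c) > abs(l - hi_c) for l in losses]


def consistency_loss(pred, pred_perturbed, mask):
    """Mean squared difference between predictions on the original and
    perturbed views, computed only over filtered (masked) pixels."""
    diffs = [(p - q) ** 2 for p, q, m in zip(pred, pred_perturbed, mask) if m]
    return sum(diffs) / len(diffs) if diffs else 0.0


# Usage: per-pixel segmentation losses with two clearly separated peaks.
pixel_losses = [0.10, 0.20, 0.15, 3.00, 3.20]
mask = noise_mask(pixel_losses)
reg = consistency_loss([0.9, 0.8, 0.7, 0.6, 0.5],
                       [0.9, 0.8, 0.7, 0.4, 0.5], mask)
```

Pixels flagged by the mask are dropped from the segmentation loss but still contribute through the consistency term, which matches the stated goal of giving every pixel some supervision.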