PrevMatch: Revisiting and Maximizing Temporal Knowledge in Semi-Supervised Semantic Segmentation

Wooseok Shin, Hyun Joon Park, Jin Sob Kim, Juan Yun, Se Hong Park, Sung Won Han

TL;DR

PrevMatch revisits temporal knowledge in semi-supervised semantic segmentation by reusing past models as additional, stochastic pseudo-label guidance and fusing it with standard self-training. The approach employs a highly randomized ensemble of saved models to maximize coverage of reliable views while maintaining efficiency, avoiding heavy dual-EMA or co-training architectures. Empirical results on Pascal VOC, Cityscapes, and ADE20K show consistent improvements across label partitions and protocols, along with enhanced training stability and generalization to unseen data. The method offers a practical, plug-in enhancement that scales with existing baselines and reduces the computational burden of achieving robust semi-supervised performance.

Abstract

In semi-supervised semantic segmentation, Mean Teacher- and co-training-based approaches are employed to mitigate confirmation bias and coupling problems. However, despite their high performance, these approaches frequently involve complex training pipelines and a substantial computational burden, limiting their scalability and compatibility. In this paper, we propose PrevMatch, a framework that mitigates these limitations by maximizing the use of the temporal knowledge obtained during training. PrevMatch relies on two core strategies: (1) we reconsider the use of temporal knowledge and directly utilize previous models obtained during training to generate additional pseudo-label guidance, referred to as previous guidance; and (2) we design a highly randomized ensemble strategy to maximize the effectiveness of the previous guidance. PrevMatch, a simple yet effective plug-in method, can be seamlessly integrated into existing semi-supervised learning frameworks with minimal computational overhead. Experimental results on three benchmark semantic segmentation datasets show that incorporating PrevMatch into existing methods significantly improves their performance. Furthermore, our analysis indicates that PrevMatch facilitates stable optimization during training, resulting in improved generalization performance.
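
To make the two strategies more concrete, the following is a minimal sketch of a randomized ensemble over predictions from previously saved models. It is an illustration under stated assumptions, not the authors' implementation: the function names (`random_ensemble`, `to_pseudo_labels`), the uniform random weighting, and the confidence threshold are all hypothetical choices, and predictions are represented as plain nested lists (pixels by classes) where a real pipeline would use tensors.

```python
import random


def random_ensemble(prob_maps, rng):
    """Fuse a random subset of saved-model predictions with random weights.

    prob_maps: one entry per saved model; each entry is a list of pixels,
    and each pixel is a list of class probabilities.
    (Assumption: subset size and weights are drawn uniformly at random.)
    """
    k = rng.randint(1, len(prob_maps))            # random subset size in [1, K]
    chosen = rng.sample(prob_maps, k)             # random subset of previous models
    raw = [rng.random() + 1e-8 for _ in chosen]   # random positive weights
    total = sum(raw)
    weights = [w / total for w in raw]            # normalize so weights sum to 1

    num_pixels = len(chosen[0])
    num_classes = len(chosen[0][0])
    fused = []
    for p in range(num_pixels):
        # Weighted average of the chosen models' probabilities at pixel p.
        fused.append([
            sum(w * m[p][c] for w, m in zip(weights, chosen))
            for c in range(num_classes)
        ])
    return fused


def to_pseudo_labels(fused, threshold=0.95):
    """Turn fused probabilities into hard labels plus a confidence mask.

    (Assumption: a fixed confidence threshold, as in FixMatch-style training.)
    """
    labels, mask = [], []
    for probs in fused:
        conf = max(probs)
        labels.append(probs.index(conf))   # argmax class
        mask.append(conf >= threshold)     # keep only confident pixels
    return labels, mask


# Toy usage: three "previous models", two pixels, two classes.
rng = random.Random(0)
prev_predictions = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.8, 0.2], [0.4, 0.6]],
    [[0.7, 0.3], [0.1, 0.9]],
]
fused = random_ensemble(prev_predictions, rng)
labels, mask = to_pseudo_labels(fused, threshold=0.75)
```

Because the subset and the weights are re-drawn on every call, each training step would see a different mixture of past models, which is one plausible reading of "highly randomized ensemble". How this guidance is combined with the current model's own pseudo-labels is not specified in the abstract and is left out of the sketch.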

Paper Structure

This paper contains 30 sections, 2 equations, 4 figures, and 16 tables.

Figures (4)

  • Figure 1: Illustration of the frameworks for (a) FixMatch [sohn2020fixmatch], (b) Mean Teacher-based structure [tarvainen2017mean], (c) Co-training [chen2021semi, li2023diverse] (cps: cross pseudo supervision), (d) Dual Mean Teacher [liu2022perturbed, na2023switching], and (e) the proposed method. In (d), the inputs ($x^{s_{cut}}, x^{s_{class}}$) indicate the CutMix [yun2019cutmix] and ClassMix [olsson2021classmix] augmentations used in Dual Teacher [na2023switching].
  • Figure 2: Training curves for the chair and sofa classes, illustrating variations in pseudo-label pixel accuracy and validation IoU scores (on the Pascal VOC 92-label partition).
  • Figure 3: Qualitative segmentation results on (a) Pascal VOC and (b) Cityscapes.
  • Figure 4: Training curves for different label partition settings on Pascal VOC. The X- and Y-axes represent epochs and validation mIoU, respectively. The square symbol ($\blacksquare$) denotes the epoch with the best performance.