Exploiting Inherent Class Label: Towards Robust Scribble Supervised Semantic Segmentation
Xinliang Zhang, Lei Zhu, Shuang Zeng, Hangzhou He, Ourui Fu, Zhengjian Yao, Zhaoheng Xie, Yanye Lu
TL;DR
This work tackles scribble-supervised semantic segmentation (SSSS) by introducing CSPNet, the class-driven scribble promotion network, which uses the class labels inherent in scribble annotations to generate robust pseudo-labels instead of relying solely on noisy scribble-driven predictions. Its two key components are the localization rectification module (LoRM), which rectifies misled foreground representations, and the distance perception module (DPM), which identifies reliable regions surrounding scribble annotations and pseudo-labels. A dedicated scribble simulation algorithm and two large-scale benchmarks, ScribbleCOCO and ScribbleCityscapes, enable evaluation across diverse scribble styles. Empirical results show state-of-the-art performance and strong robustness to scribble variability, and the code and datasets are to be released publicly.
Abstract
Scribble-based weakly supervised semantic segmentation uses only a few annotated pixels as labels to train a segmentation model, offering significant potential for reducing the human labor involved in annotation. This approach faces two primary challenges. First, the sparsity of scribble annotations provides limited supervision, which can lead to inconsistent predictions. Second, scribble annotations vary with the preferences of individual annotators, which can prevent the model from consistently capturing the discriminative regions of objects and lead to unstable predictions. To address these issues, we propose a holistic framework, the class-driven scribble promotion network (CSPNet), for robust scribble-supervised semantic segmentation. The framework uses not only the provided scribble annotations but also their associated class labels to generate reliable pseudo-labels. Within the network, we introduce a localization rectification module to mitigate noisy labels and a distance perception module to identify reliable regions surrounding scribble annotations and pseudo-labels. In addition, we introduce two new large-scale benchmarks, ScribbleCOCO and ScribbleCityscapes, accompanied by a scribble simulation algorithm that enables evaluation across varying scribble styles. Experiments show that our method compares favorably with existing approaches in both accuracy and robustness. The datasets and code will be made publicly available.
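To make the idea of "reliable regions surrounding scribble annotations" concrete, the following is a minimal sketch, not the paper's distance perception module. It assumes that reliability decays with Euclidean distance from the nearest scribble pixel, and it uses a Gaussian decay with a hypothetical bandwidth sigma. The function names distance_to_scribble and reliability_map are illustrative only, and the brute-force distance computation is intended for small masks.

```python
import numpy as np


def distance_to_scribble(scribble_mask: np.ndarray) -> np.ndarray:
    """Euclidean distance from every pixel to its nearest scribble pixel.

    Brute-force version for small masks: memory grows as H * W * N,
    where N is the number of scribble pixels.
    """
    ys, xs = np.nonzero(scribble_mask)
    if ys.size == 0:
        raise ValueError("scribble_mask contains no annotated pixels")
    h, w = scribble_mask.shape
    grid_y, grid_x = np.mgrid[0:h, 0:w]
    # Squared distance from each pixel to each scribble pixel: shape (H, W, N).
    d2 = (grid_y[..., None] - ys) ** 2 + (grid_x[..., None] - xs) ** 2
    return np.sqrt(d2.min(axis=-1))


def reliability_map(scribble_mask: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Assumed reliability weight in (0, 1]: 1 on the scribble, decaying with distance.

    The Gaussian form and the default sigma are illustrative assumptions.
    """
    d = distance_to_scribble(scribble_mask)
    return np.exp(-(d ** 2) / (2.0 * sigma ** 2))


if __name__ == "__main__":
    mask = np.zeros((5, 5), dtype=bool)
    mask[2, 2] = True  # a single annotated pixel in the centre
    weights = reliability_map(mask, sigma=2.0)
    print(np.round(weights, 3))
```

Under these assumptions, such a map could be used to down-weight a pixel-wise loss on pseudo-labels far from any scribble, so that supervision is trusted most where an annotator actually drew.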
