Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets

Wolfgang Boettcher; Lukas Hoyer; Ozan Unal; Jan Eric Lenssen; Bernt Schiele

TL;DR

This work tackles the high labeling cost of dense semantic segmentation by introducing Scribbles for All, an automatic scribble generator that converts dense annotations into scribble labels across multiple datasets. It creates four new scribble datasets (s4Pascal, s4Cityscapes, s4KITTI360, s4ADE20K) and demonstrates that state-of-the-art scribble-based methods can achieve competitive performance relative to fully supervised baselines, enabling robust benchmarking in diverse domains. The paper provides a detailed algorithm with design objectives to mimic human scribbles and validates the approach through extensive experiments, including scribble-length ablations. Overall, Scribbles for All broadens the evaluation of scribble-supervised segmentation, supports domain-adaptive research, and encourages using scribble-labeled data to leverage foundation models for practical, low-cost annotation strategies.

Abstract

In this work, we introduce Scribbles for All, a label and training data generation algorithm for semantic segmentation trained on scribble labels. Training or fine-tuning semantic segmentation models with weak supervision has recently become an important topic and has seen significant advances in model quality. In this setting, scribbles are a promising label type for achieving high-quality segmentation results while requiring much less annotation effort than the usual pixel-wise dense semantic segmentation annotations. The main limitation of scribbles as a source of weak supervision is the lack of challenging datasets for scribble segmentation, which hinders the development of novel methods and conclusive evaluations. To overcome this limitation, Scribbles for All provides scribble labels for several popular segmentation datasets, together with an algorithm that automatically generates scribble labels for any dataset with dense annotations, paving the way for new insights and model advancements in weakly supervised segmentation. In addition to providing the datasets and the algorithm, we evaluate state-of-the-art segmentation models on our datasets and show that models trained with our synthetic labels perform competitively with models trained on manual labels. Thus, our datasets enable state-of-the-art research into methods for scribble-labeled semantic segmentation. The datasets, scribble generation algorithm, and baselines are publicly available at https://github.com/wbkit/Scribbles4All

Paper Structure (25 sections, 7 figures, 8 tables, 1 algorithm)

Introduction
Related Work
Automated Scribble Generation
Design Objectives
Scribble Generation Algorithm
Automatic Scribble Datasets
Experiments
Implementation Details
Baseline Scribble Datasets
Scribble Length Ablations
Limitations
Conclusion
Supplementary Material
Parameterization of the Scribble Generation Algorithm
Further Dataset Information
...and 10 more sections

Figures (7)

Figure 1: Visual difference in scribble-supervised performance. While predictions from scribble-supervised models are almost identical to those of fully supervised models for PascalVOC, the segmentation quality of scribble-supervised Cityscapes models is visibly poorer (see dotted boxes), highlighting the greater complexity of the dataset and the need for further research.
Figure 2: Overview of label types -- Left to right: Full PascalVOC semantic label, scribble labels created by our scribble generation algorithm for s4Pascal, hand-drawn scribble labels from ScribbleSup.
Figure 3: Scribble Generation -- a) Size-dependent erosion. b) Center of mass (COM) in red, points sampled on the edge in green, the approximate farthest pair in darker green, and the tentative scribble in blue. c) Sampling of two extra points along the tentative scribble. d) Fitting the final scribble through the points. e) Scribble overlaid on the initial segmentation map.
Figure 4: Class distribution for s4Pascal and ScribbleSup -- The distributions of all object classes are very similar, with the exception of the background class, which is more heavily labeled in ScribbleVOC.
Figure 6: Qualitative performance on the s4-datasets and ScribbleSup -- Shown are the input image overlaid with the corresponding scribbles, the EMA-model prediction, and the ground truth. Colormaps can be found in the appendix.
...and 2 more figures
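
The steps named in the Figure 3 caption (erosion, center of mass, edge sampling, farthest pair, two extra points, curve fit) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which is available in the linked repository. The function names, the erosion scaling rule, the use of the COM to bend the scribble inward, and the piecewise-linear fit are all assumptions made for illustration.

```python
import numpy as np

def erode(mask, iterations):
    """Binary erosion with a 4-neighbourhood, repeated `iterations` times."""
    out = mask.astype(bool)
    for _ in range(iterations):
        p = np.pad(out, 1, constant_values=False)
        out = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
               & p[1:-1, :-2] & p[1:-1, 2:])
    return out

def boundary(mask):
    """Mask pixels that touch at least one non-mask 4-neighbour."""
    return mask & ~erode(mask, 1)

def sketch_scribble(mask, n_edge=32, seed=0):
    """Hypothetical helper: one scribble for one binary segment mask."""
    rng = np.random.default_rng(seed)
    mask = mask.astype(bool)
    scribble = np.zeros_like(mask)
    if not mask.any():
        return scribble

    # a) Size-dependent erosion (the 0.1 * sqrt(area) rule is an assumption).
    iters = max(1, int(0.1 * np.sqrt(mask.sum())))
    core = erode(mask, iters)
    if not core.any():
        core = mask  # fall back for very small segments

    # b) COM, sampled edge points, and the farthest pair among the samples.
    com = np.argwhere(core).mean(axis=0)
    edge = np.argwhere(boundary(core))
    idx = rng.choice(len(edge), size=min(n_edge, len(edge)), replace=False)
    pts = edge[idx].astype(float)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    start, end = pts[i], pts[j]

    # c) Two extra points along the tentative (straight) scribble,
    #    pulled halfway toward the COM (assumption).
    extra = [0.5 * (start + t * (end - start) + com) for t in (1 / 3, 2 / 3)]
    ctrl = np.stack([start, *extra, end])

    # d) Fit through the points; a piecewise-linear path stands in for a curve.
    t = np.linspace(0, len(ctrl) - 1, 200)
    knots = np.arange(len(ctrl))
    rows = np.interp(t, knots, ctrl[:, 0])
    cols = np.interp(t, knots, ctrl[:, 1])
    rc = np.unique(np.round(np.stack([rows, cols], axis=1)).astype(int), axis=0)

    # e) Keep only scribble pixels that lie inside the original segment.
    rc = rc[mask[rc[:, 0], rc[:, 1]]]
    scribble[rc[:, 0], rc[:, 1]] = True
    return scribble

# Usage on a synthetic rectangular segment.
dense = np.zeros((64, 96), dtype=bool)
dense[10:50, 20:80] = True
scribble = sketch_scribble(dense)
```

Eroding first keeps the scribble away from object boundaries, where hand-drawn scribbles rarely go. Sampling only a subset of edge points makes the farthest-pair search approximate but cheap.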
