Unsupervised Class Generation to Expand Semantic Segmentation Datasets

Javier Montalvo; Álvaro García-Martín; Pablo Carballeira; Juan C. SanMiguel

TL;DR

This work tackles the limited class vocabulary of synthetic semantic segmentation datasets by introducing an unsupervised, training-free pipeline that combines Stable Diffusion with the Segment Anything Model to generate novel-class cutouts with precise masks. It then integrates these samples into existing datasets through a MixUp-style augmentation, so they can be used in unsupervised domain adaptation pipelines without changing segmentation architectures. The authors validate the approach in synthetic-to-real settings, reaching an average of 51% IoU on novel classes and also reducing errors on existing classes. The results suggest that expanding label spaces with automatically generated, curated image–mask pairs can meaningfully improve semantic segmentation on real data.

Abstract

Semantic segmentation is a computer vision task where classification is performed at a pixel level. Due to this, the process of labeling images for semantic segmentation is time-consuming and expensive. To mitigate this cost there has been a surge in the use of synthetically generated data (usually created using simulators or videogames) which, in combination with domain adaptation methods, can be used to train models that segment real data effectively. Still, these datasets have a particular limitation: due to their closed-set nature, it is not possible to include novel classes without modifying the tool used to generate them, which is often not public. Concurrently, generative models have made remarkable progress, particularly with the introduction of diffusion models, enabling the creation of high-quality images from text prompts without additional supervision. In this work, we propose an unsupervised pipeline that leverages Stable Diffusion and the Segment Anything Model to generate class examples with an associated segmentation mask, and a method to integrate generated cutouts for novel classes in semantic segmentation datasets, all with minimal user input. Our approach aims to improve the performance of unsupervised domain adaptation methods by introducing novel samples into the training data without modifications to the underlying algorithms. With our methods, we show how models can not only effectively learn how to segment novel classes, with an average performance of 51% IoU, but also reduce errors for other, already existing classes, reaching a higher performance level overall.
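To make the integration step and the reported metric more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the generated cutout and its binary mask are already available, and it uses a plain copy-paste of the masked pixels into a training image and its label map. The function names `paste_cutout` and `class_iou`, the grayscale list-of-lists image format, and the class id `5` are all hypothetical choices made for illustration.

```python
def paste_cutout(image, labels, cutout, mask, class_id, top, left):
    """Return copies of image and labels with the masked cutout pasted at (top, left)."""
    out_img = [row[:] for row in image]
    out_lbl = [row[:] for row in labels]
    for r, mask_row in enumerate(mask):
        for c, inside in enumerate(mask_row):
            y, x = top + r, left + c
            # Skip cutout background pixels and anything falling outside the frame.
            if inside and 0 <= y < len(out_img) and 0 <= x < len(out_img[0]):
                out_img[y][x] = cutout[r][c]
                out_lbl[y][x] = class_id  # the novel class overwrites the old label
    return out_img, out_lbl


def class_iou(pred, target, class_id):
    """Intersection over union for one class; None if the class is absent from both."""
    inter = union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            in_p, in_t = p == class_id, t == class_id
            inter += in_p and in_t
            union += in_p or in_t
    return inter / union if union else None


# Toy 4x4 scene (hypothetical values): background label 0, novel class id 5.
image = [[0] * 4 for _ in range(4)]
labels = [[0] * 4 for _ in range(4)]
cutout = [[9, 9], [9, 9]]
mask = [[1, 0], [1, 1]]  # one corner of the cutout is background

new_image, new_labels = paste_cutout(image, labels, cutout, mask, 5, top=1, left=2)

# Compare against a ground truth where the full 2x2 block belongs to the class.
target = [row[:] for row in labels]
for y in (1, 2):
    for x in (2, 3):
        target[y][x] = 5
iou = class_iou(new_labels, target, 5)
```

In this toy case three of the four target pixels are labelled with the novel class, so the per-class IoU is 3/4. The original image and label map are left untouched, which keeps the augmentation easy to apply on the fly during training.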

Paper Structure (17 sections, 6 equations, 6 figures, 2 tables)

Introduction
Related Work
Generating Synthetic Data
Synthetic environments
Diffusion models
Exploiting Synthetic Data
Segment Anything Model
Method
Pipeline Definition
Mask Curation
Including Novel Classes in Unsupervised Domain Adaptation Pipelines for Semantic Segmentation
Experiments
Including New Classes on Datasets
Ablation Tests
Impact of Appearance Rate
...and 2 more sections

Figures (6)

Figure 1: Schematic of our pipeline. The upper path contains the generation of the synthetic image and the lower path depicts the process of obtaining the semantic mask for the generated example.
Figure 2: Images a and b show examples of valid images and masks for the bus class; c shows what seems to be the interior of a bus, and d has a noisy mask, so both are discarded.
Figure 3: Example of our combination process.
Figure 4: Each row shows cutout examples for one class: bus, train and truck, from top to bottom.
Figure 5: Confusion matrices for vehicle classes before and after including the two missing classes. (a) Synthia baseline, (b) Synthia + our approach, (c) CARLA-4AGT, (d) CARLA-4AGT + our approach.
...and 1 more figure