Reliability in Semantic Segmentation: Can We Use Synthetic Data?
Thibaut Loiseau, Tuan-Hung Vu, Mickael Chen, Patrick Pérez, Matthieu Cord
TL;DR
The paper addresses the problem of validating semantic segmentation models under covariate shift and on unseen OOD inputs. It introduces a zero-shot synthetic-data framework built on ControlNet and Stable Diffusion, fine-tuned only on in-domain Cityscapes data. The pipeline generates images in OOD domains and images inpainted with OOD objects without requiring any real OOD data, which allows the robustness of 40 pretrained segmenters to be evaluated across diverse shifts. The paper reports a strong correlation between model performance on synthetic OOD data and on real OOD data, and shows that the synthetic data can also be used to improve calibration and OOD detection. The approach offers a scalable, cost-effective route to virtual reliability testing in safety-critical settings and provides practical guidance on data requirements and domain coverage for effective validation.
Abstract
Assessing the robustness of perception models to covariate shifts and their ability to detect out-of-distribution (OOD) inputs is crucial for safety-critical applications such as autonomous vehicles. By the nature of such applications, however, the relevant data is difficult to collect and annotate. In this paper, we show for the first time how synthetic data can be specifically generated to comprehensively assess the real-world reliability of semantic segmentation models. By fine-tuning Stable Diffusion with only in-domain data, we perform zero-shot generation of visual scenes in OOD domains or inpainted with OOD objects. This synthetic data is used to evaluate the robustness of pretrained segmenters, thereby offering insights into their performance when confronted with real edge cases. Through extensive experiments, we demonstrate a high correlation between the performance of models when evaluated on our synthetic OOD data and when evaluated on real OOD inputs, showing the relevance of such virtual testing. Furthermore, we demonstrate how our approach can be used to enhance the calibration and OOD detection capabilities of segmenters. Code and data are made public.
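The correlation claim can be made concrete with a small sketch. The abstract does not state which correlation coefficient is used, so the code below assumes Pearson and Spearman coefficients computed over per-model scores; the function names and the mIoU values are hypothetical placeholders for illustration, not results from the paper.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical placeholder scores, one entry per segmenter (not from the paper).
synthetic_miou = [41.2, 55.0, 48.3, 62.1, 37.5]
real_miou = [39.0, 57.4, 45.1, 60.2, 40.3]

print(f"Pearson:  {pearson(synthetic_miou, real_miou):.3f}")
print(f"Spearman: {spearman(synthetic_miou, real_miou):.3f}")
```

A rank correlation is the more relevant of the two when the goal is model selection: it asks whether the synthetic benchmark orders segmenters the same way the real OOD data does, regardless of the absolute score gap.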
