Learning to Generate Training Datasets for Robust Semantic Segmentation

Marwane Hariat, Olivier Laurent, Rémi Kazmierczak, Shihao Zhang, Andrei Bursuc, Angela Yao, Gianni Franchi

TL;DR

This work designs Robusta, a novel robust conditional generative adversarial network to generate realistic and plausible perturbed images that can be used to train reliable segmentation models by leveraging the synergy between label-to-image generators and image-to-label segmentation models.

Abstract

Semantic segmentation methods have advanced significantly. Still, their robustness to real-world perturbations and object types not seen during training remains a challenge, particularly in safety-critical applications. We propose a novel approach to improve the robustness of semantic segmentation techniques by leveraging the synergy between label-to-image generators and image-to-label segmentation models. Specifically, we design Robusta, a novel robust conditional generative adversarial network to generate realistic and plausible perturbed images that can be used to train reliable segmentation models. We conduct in-depth studies of the proposed generative model, assess the performance and robustness of the downstream segmentation network, and demonstrate that our approach can significantly enhance robustness in the face of real-world perturbations, distribution shifts, and out-of-distribution samples. Our results suggest that this approach could be valuable in safety-critical applications, where the reliability of perception modules such as semantic segmentation is of utmost importance and the computational budget at inference is limited. We release our code at https://github.com/ENSTA-U2IS-AI/robusta.

Paper Structure (58 sections, 15 equations, 14 figures, 15 tables)

Figures (14)

  • Figure 0: Illustration of image synthesis models under different perturbations. Compared to previous work (OASIS, Sushko et al., 2021), our results express more natural textures and details, even under anomalies or shifts in the input distribution.
  • Figure 1: Illustration of the pipeline. The pipeline diagram depicts three steps. Firstly, we train the new Robusta model. Secondly, we utilize Robusta to create a diverse, high-quality dataset that includes various objects of interest. Finally, we train a segmentation model on this augmented dataset.
  • Figure 2: Illustration of Robusta's generation process. First, we train the network $G_{\text{coarse}}$, which produces low-resolution images. We then add a second generator, $G_{\text{fine}}$, which improves the image quality of the output of $G_{\text{coarse}}$. The + operation corresponds to a concatenation.
  • Figure 3: Perturbed label maps. Different label maps from Corrupted-Cityscapes (top) and Outlier-Cityscapes (bottom).
  • Figure S4: Illustration of the first generator of Robusta, $G_{\text{coarse}}$. The green blocks are detailed in Table \ref{tab:robusta_details}.
  • ...and 9 more figures