SynDiff-AD: Improving Semantic Segmentation and End-to-End Autonomous Driving with Synthetic Data from Latent Diffusion Models

Harsh Goel; Sai Shankar Narasimhan; Oguzhan Akcin; Sandeep Chinchali

TL;DR

SynDiff-AD targets the pervasive problem of under-represented driving conditions in semantic segmentation and end-to-end autonomous driving by generating semantically consistent, subgroup-specific synthetic data. It combines latent diffusion models with ControlNet conditioned on semantic maps and a novel CaG prompting scheme to produce images that balance datasets across weather and lighting subgroups, without additional labeling. Empirical results on Waymo, DeepDrive, and CARLA show improvements in segmentation (up to 2.3 mIoU) and in driving performance (up to ~20% in driving score, DS) across diverse conditions, with ablations confirming that CaG improves synthetic data quality. The approach reduces labeling costs and improves model robustness, though it is limited to single-view data and does not explore adversarial or multi-view data generation for further gains.

Abstract

In recent years, significant progress has been made in collecting large-scale datasets to improve segmentation and autonomous driving models. These large-scale datasets are often dominated by common environmental conditions such as "Clear and Day" weather, leading to decreased performance in under-represented conditions like "Rainy and Night". To address this issue, we introduce SynDiff-AD, a novel data augmentation pipeline that leverages diffusion models (DMs) to generate realistic images for such subgroups. SynDiff-AD uses ControlNet, a DM that guides data generation conditioned on semantic maps, along with a novel prompting scheme that generates subgroup-specific, semantically dense prompts. By augmenting datasets with SynDiff-AD, we improve the performance of segmentation models like Mask2Former and SegFormer by up to 1.2% and 2.3% on the Waymo dataset, and up to 1.4% and 0.7% on the DeepDrive dataset, respectively. Additionally, we demonstrate that our SynDiff-AD pipeline enhances the driving performance of end-to-end autonomous driving models, like AIM-2D and AIM-BEV, by up to 20% across diverse environmental conditions in the CARLA autonomous driving simulator, providing a more robust model. We release our code and pipeline at https://github.com/UTAustin-SwarmLab/SynDiff-AD.
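
As a rough illustration of the subgroup imbalance the abstract describes, the sketch below counts images per weather-and-lighting subgroup and computes how many synthetic images each subgroup would need to match the largest one. The balancing rule (equalize to the largest subgroup), the function name `synthetic_budget`, and the toy labels are all assumptions made for illustration; they are not taken from the paper or the released code.

```python
from collections import Counter


def synthetic_budget(subgroup_labels):
    """Return, per subgroup, how many synthetic images would be needed
    to match the size of the largest subgroup (an assumed balancing rule)."""
    counts = Counter(subgroup_labels)
    target = max(counts.values())
    return {group: target - n for group, n in counts.items()}


# Toy dataset: one subgroup label per image (hypothetical values).
labels = ["Clear, Day"] * 6 + ["Rainy, Night"] * 1 + ["Cloudy, Day"] * 3
budget = synthetic_budget(labels)
print(budget)
```

In this toy case the dominant "Clear, Day" subgroup needs no additions, while the rarer subgroups receive the difference. A real pipeline would then generate that many images per subgroup from existing semantic maps, so the original labels can be reused.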
