Efficient Domain Augmentation for Autonomous Driving Testing Using Diffusion Models
Luciano Baresi, Davide Yi Xian Hu, Andrea Stocco, Paolo Tonella
TL;DR
The paper addresses the limited operational design domain (ODD) coverage of simulators for Autonomous Driving Systems (ADS) by integrating diffusion-model-driven domain augmentation with physics-based simulators, coupled with a semantic validator and a knowledge-distilled online renderer. It presents three augmentation strategies (instruction-editing, inpainting, and inpainting with refinement), a semantic validation pipeline using OC-TSS, and a CycleGAN-based distillation approach that enables real-time rendering. Empirical results show improved ODD diversity and fault exposure. The semantic validator keeps false positives as low as 3%, inpainting with refinement yields the most realistic images, and knowledge distillation substantially reduces rendering overhead in the Udacity and CARLA experiments. The work demonstrates that GenAI-augmented simulation can reveal ADS failures beyond predefined simulator scenarios, supporting more robust system-level testing and safer real-world deployment.
Abstract
Simulation-based testing is widely used to assess the reliability of Autonomous Driving Systems (ADS), but its effectiveness is limited by the operational design domain (ODD) conditions available in such simulators. To address this limitation, we explore the integration of generative artificial intelligence techniques with physics-based simulators to enhance ADS system-level testing. Our study evaluates the effectiveness and computational overhead of three generative strategies based on diffusion models, namely instruction-editing, inpainting, and inpainting with refinement. Specifically, we assess the ability of these techniques to augment simulator-generated images of driving scenarios so that they represent new ODDs. We employ a novel automated detector for invalid inputs, based on semantic segmentation, to ensure the semantic preservation and realism of the neural-generated images. We then perform system-level testing to evaluate the ADS's ability to generalize to the newly synthesized ODDs. Our findings show that diffusion models help increase the ODD coverage for system-level testing of ADS. Our automated semantic validator achieved a percentage of false positives as low as 3%, retaining the correctness and quality of the generated images for testing. Our approach successfully identified new ADS system failures before real-world testing.
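To make the idea of a segmentation-based detector for invalid inputs more concrete, the following is a minimal sketch and not the validator used in the paper. It assumes that a segmentation model has already produced integer label maps for the original simulator frame and for its augmented counterpart, and it flags an augmentation as invalid when the overlap (intersection over union) of a driving-relevant class drops below a threshold. The class id `ROAD`, the function names, and the threshold of 0.9 are hypothetical choices made for illustration.

```python
import numpy as np

ROAD = 1  # hypothetical class id for the road label


def class_iou(mask_a, mask_b, class_id):
    """Intersection over union of one class between two label maps."""
    a = mask_a == class_id
    b = mask_b == class_id
    union = np.logical_or(a, b).sum()
    if union == 0:
        # The class is absent from both maps, so nothing was altered.
        return 1.0
    return float(np.logical_and(a, b).sum() / union)


def is_semantically_valid(original_mask, augmented_mask,
                          class_ids=(ROAD,), threshold=0.9):
    """Accept the augmentation only if every checked class is preserved.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return all(
        class_iou(original_mask, augmented_mask, c) >= threshold
        for c in class_ids
    )


# Toy 4x4 label maps: the bottom two rows are road.
original = np.zeros((4, 4), dtype=int)
original[2:, :] = ROAD

preserved = original.copy()   # augmentation left the road layout intact
distorted = original.copy()
distorted[2, :] = 0           # augmentation erased half of the road pixels

print(is_semantically_valid(original, preserved))
print(is_semantically_valid(original, distorted))
```

In this toy case the preserved map passes the check, while the distorted map has a road overlap of 0.5 and is rejected. A real pipeline would obtain the label maps from a trained segmentation network and would likely check several classes.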
