Bridging the Gap between Real-world and Synthetic Images for Testing Autonomous Driving Systems

Mohammad Hossein Amini; Shiva Nejati

TL;DR

This work tackles the mismatch between the real-world images used to train autonomous driving DNNs and the simulator images used to test them by evaluating three domain-to-domain translators: CycleGAN, neural style transfer, and SAEVAE, the authors' proposed translator. It assesses their impact on offline and online testing for two tasks, lane keeping and object detection, using test-data quality metrics (diversity and coverage) and fault-revealing ability metrics. The findings show that translators, particularly SAEVAE, substantially narrow the distribution gap and the resulting gap in test accuracy, without compromising the diversity or coverage of the test data or revealing fewer faults. SAEVAE adds negligible simulation-time overhead, and translators increase the correlation between offline and online results, which can help reduce the cost of simulation-based testing. These results support integrating SAEVAE into ADS testing workflows, and the authors provide replication materials to enable broader reuse.

Abstract

Deep Neural Networks (DNNs) for Autonomous Driving Systems (ADS) are typically trained on real-world images and tested using synthetic simulator images. This approach results in training and test datasets with dissimilar distributions, which can potentially lead to erroneously decreased test accuracy. To address this issue, the literature suggests applying domain-to-domain translators to test datasets to bring them closer to the training datasets. However, translating images used for testing may unpredictably affect the reliability, effectiveness and efficiency of the testing process. Hence, this paper investigates the following questions in the context of ADS: Could translators reduce the effectiveness of images used for ADS-DNN testing and their ability to reveal faults in ADS-DNNs? Can translators result in excessive time overhead during simulation-based testing? To address these questions, we consider three domain-to-domain translators: CycleGAN and neural style transfer, from the literature, and SAEVAE, our proposed translator. Our results for two critical ADS tasks -- lane keeping and object detection -- indicate that translators significantly narrow the gap in ADS test accuracy caused by distribution dissimilarities between training and test data, with SAEVAE outperforming the other two translators. We show that, based on the recent diversity, coverage, and fault-revealing ability metrics for testing deep-learning systems, translators do not compromise the diversity and the coverage of test data, nor do they lead to revealing fewer faults in ADS-DNNs. Further, among the translators considered, SAEVAE incurs a negligible overhead in simulation time and can be efficiently integrated into simulation-based testing. Finally, we show that translators increase the correlation between offline and simulation-based testing results, which can help reduce the cost of simulation-based testing.

Paper Structure (20 sections, 1 equation, 6 figures, 10 tables)

Introduction
Image-to-Image Translation
Online Testing with Translators
Evaluation Setup
Offline Datasets and ADS-DNNs
Training ADS-DNNs and Translators
Setup for Online ADS Testing
Test Data Quality Metrics
Results
RQ1: Test data distribution gap mitigation
RQ2: Offline testing accuracy gap mitigation
RQ3: Online testing failure reduction
RQ4: Test data quality preservation
RQ5: Translators' time overhead
RQ6: Online vs offline results correlation
...and 5 more sections

Figures (6)

Figure 1: Three sample images for ADS: (a) an image from a real-world dataset used to train an ADS-DNN; (b) a simulator-generated image for ADS-DNN testing; and (c) the transformation of image (b) using the SAEVAE translator
Figure 2: The training process for our SAEVAE translators
Figure 3: Extending online testing for ADS with translators: (a) The embedding of translators into the online testing loop; (b) prerequisite datasets for training ADS-DNNs and translators; and (c) overview of concepts in the ADS domain
Figure 4: Reconstruction-error distributions obtained by a VAE trained on $D_{\mathit{real}}$ for lane-keeping and object-detection tasks. The error distributions are shown for $D_{\mathit{real}}$, $D_{\mathit{sim}}$, and the translations of $D_{\mathit{sim}}$ by SAEVAE, cycleG and styleT.
Figure 5: MAE results of the lane-keeping ADS-DNNs for the real-world ($D_{\mathit{real, test}}$) and synthetic ($D_{\mathit{sim}}$) datasets, and for the translations of $D_{\mathit{sim}}$ obtained by SAEVAE, cycleG and styleT
...and 1 more figure
