A-BDD: Leveraging Data Augmentations for Safe Autonomous Driving in Adverse Weather and Lighting

Felix Assion, Florens Gressner, Nitin Augustine, Jona Klemenc, Ahmed Hammam, Alexandre Krattinger, Holger Trittenbach, Anja Philippsen, Sascha Riemer

TL;DR

A-BDD is a set of over 60,000 synthetically augmented images based on BDD100K, with semantic segmentation and bounding box annotations inherited from BDD100K. Experiments on A-BDD provide evidence that data augmentations can play a pivotal role in closing performance gaps in adverse weather and lighting conditions.

Abstract

High-autonomy vehicle functions rely on machine learning (ML) algorithms to understand the environment. Despite displaying remarkable performance in fair weather scenarios, perception algorithms are heavily affected by adverse weather and lighting conditions. To overcome these difficulties, ML engineers mainly rely on comprehensive real-world datasets. However, the difficulties in real-world data collection for critical areas of the operational design domain (ODD) often mean synthetic data is required for perception training and safety validation. Thus, we present A-BDD, a large set of over 60,000 synthetically augmented images based on BDD100K that are equipped with semantic segmentation and bounding box annotations (inherited from the BDD100K dataset). The dataset contains augmented data for rain, fog, overcast, and sunglare/shadow with varying intensity levels. We further introduce novel strategies utilizing feature-based image quality metrics like FID and CMMD, which help identify useful augmented and real-world data for ML training and testing. By conducting experiments on A-BDD, we provide evidence that data augmentations can play a pivotal role in closing performance gaps in adverse weather and lighting conditions.
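
The abstract names FID as one of the feature-based metrics used to compare image sets. As a rough illustration only, the following sketch computes a Fréchet distance between two sets of feature embeddings by fitting a Gaussian to each set. It assumes the embeddings have already been extracted by some feature network; the function name `frechet_distance` and the array shapes are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two embedding sets.

    Both inputs are (num_images, feature_dim) arrays of precomputed
    embeddings (hypothetical inputs for this sketch).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr(sqrt(cov_a @ cov_b)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite covariances. Clipping guards against tiny
    # negative values caused by floating-point error.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
```

A smaller value means the two embedding distributions are closer under the Gaussian assumption. In a setting like the one described above, `feats_a` might hold embeddings of an augmented subset and `feats_b` embeddings of real adverse-weather images.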

Paper Structure (19 sections, 1 equation, 5 figures, 5 tables)


Figures (5)

  • Figure 1: Example comparison between real-world data from BDD100K and augmented data from A-BDD. The first column presents reference fair weather images from BDD100K, while the second column shows corresponding augmented images from A-BDD. To emphasize the visual similarity to real-world rain and fog images from BDD100K, sample trigger data is included in the third column.
  • Figure 2: Comparison of an unaltered image from BDD100K and augmented images from A-BDD.
  • Figure 3: Kernel Density Estimation (KDE) distributions of CLIP feature embeddings for ACDC trigger data, projected with Principal Component Analysis (PCA). The CLIP feature embeddings are the basis for the CMMD calculation (see the paper's section on image quality metrics).
  • Figure 4: Minimal FID/CMMD distance from the augmentation subsets of A-BDD, the augmentation subsets of Albumentations, and the ACDC trigger data to the BDD100K trigger data. The augmentation sets of A-BDD are significantly closer to the weather conditions of BDD100K than the other two datasets. In particular, we observe a notable distributional shift between the real-world trigger data from BDD100K and ACDC.
  • Figure 5: The plots show FID/CMMD distances to ACDC rain (x-axis) and corresponding mIoU results on ACDC rain after model fine-tuning (y-axis) for all 35 augmentation sets of A-BDD. A clear negative correlation is observed between FID/CMMD distances and performance gains, highlighting the importance of feature embedding similarity for the success of model training with augmentations.
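
Figures 3 to 5 all rest on distances between sets of feature embeddings, with CMMD computed from CLIP embeddings. A minimal sketch of that idea follows, assuming the embeddings are already available as arrays. It uses a biased squared maximum mean discrepancy (MMD) estimate with a Gaussian kernel and then picks the augmentation subset closest to a target set, in the spirit of the minimal-distance comparison in Figure 4. The bandwidth `sigma`, the function names, and the dictionary layout are illustrative assumptions, not the paper's settings.

```python
import numpy as np


def gaussian_kernel(x: np.ndarray, y: np.ndarray, sigma: float) -> np.ndarray:
    """Pairwise Gaussian kernel matrix between the rows of x and y."""
    sq_dists = (x**2).sum(axis=1)[:, None] + (y**2).sum(axis=1)[None, :] - 2.0 * x @ y.T
    # Clamp small negative values that can appear from round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def mmd_squared(x: np.ndarray, y: np.ndarray, sigma: float = 1.0) -> float:
    """Biased (V-statistic) estimate of squared MMD between two embedding sets."""
    k_xx = gaussian_kernel(x, x, sigma).mean()
    k_yy = gaussian_kernel(y, y, sigma).mean()
    k_xy = gaussian_kernel(x, y, sigma).mean()
    return float(k_xx + k_yy - 2.0 * k_xy)


def closest_subset(subsets: dict, target: np.ndarray, sigma: float = 1.0):
    """Return the name of the subset with the smallest MMD to target,
    together with all computed distances.

    `subsets` maps a hypothetical subset name to its embedding array.
    """
    distances = {name: mmd_squared(emb, target, sigma) for name, emb in subsets.items()}
    best = min(distances, key=distances.get)
    return best, distances
```

Given the per-subset distances and the mIoU obtained after fine-tuning on each subset, the relationship shown in Figure 5 could then be checked with a standard correlation coefficient such as `np.corrcoef`.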