SemanticSpray++: A Multimodal Dataset for Autonomous Driving in Wet Surface Conditions

Aldi Piroli, Vinzenz Dallabetta, Johannes Kopp, Marc Walessa, Daniel Meissner, Klaus Dietmayer

TL;DR

SemanticSpray++ addresses the lack of publicly available multimodal data for autonomous driving under wet-surface spray by extending RoadSpray and SemanticSpray with 2D camera bounding boxes, 3D LiDAR bounding boxes, and radar semantic labels. The paper details data collection scenarios, annotation pipelines, and a supporting toolkit, then evaluates baseline 2D/3D detectors and 3D segmentation to quantify spray’s impact and the benefit of fine-tuning on spray-free data. Key contributions include multimodal labeling across camera, LiDAR, and radar, comprehensive label statistics, and open-source tooling for benchmarking. The dataset enables rigorous testing of perception systems under adverse weather, advancing robustness and real-world applicability.

Abstract

Autonomous vehicles rely on camera, LiDAR, and radar sensors to navigate the environment. Adverse weather conditions like snow, rain, and fog are known to be problematic for both camera and LiDAR-based perception systems. Currently, it is difficult to evaluate the performance of these methods due to the lack of publicly available datasets containing multimodal labeled data. To address this limitation, we propose the SemanticSpray++ dataset, which provides labels for camera, LiDAR, and radar data of highway-like scenarios in wet surface conditions. In particular, we provide 2D bounding boxes for the camera image, 3D bounding boxes for the LiDAR point cloud, and semantic labels for the radar targets. By labeling all three sensor modalities, the SemanticSpray++ dataset offers a comprehensive test bed for analyzing the performance of different perception methods when vehicles travel on wet surface conditions. Together with comprehensive label statistics, we also evaluate multiple baseline methods across different tasks and analyze their performance. The dataset will be available at https://semantic-spray-dataset.github.io.

Paper Structure (13 sections, 5 figures, 1 table)

Figures (5)

  • Figure 1: The proposed SemanticSpray++ dataset offers multimodal labels across camera, LiDAR, and radar sensors for testing the effect of spray on perception systems. Top: shows the camera image with the overlaid 2D ground truth bounding box (in green) of the vehicle in front. Bottom-left: shows the captured LiDAR scan, where the 3D ground truth bounding box (in green) represents the leading vehicle. Additionally, each point has an associated semantic label, where the colors represent $\color{SS-background}{\bullet}$ background, $\color{SS-car}{\bullet}$ foreground, and $\color{SS-spray}{\bullet}$ noise. Bottom-right: shows the radar target represented by the Doppler velocity vector (green arrow). The LiDAR scan is also overlaid in gray for visualization purposes.
  • Figure 2: Overview of some of the scenes present in the proposed dataset. Top row: shows the occlusion effect caused by the windshield wipers. Middle row: shows the blur caused by the spray particles generated by the leading vehicle. Bottom row: shows how sunlight reflecting directly off the camera sensor or the wet surface leads to locally overexposed images, which hide the leading vehicle from view. The provided 2D box annotations are shown as green boxes.
  • Figure 3: Statistics of the proposed dataset. Top-left: shows the distribution of the LiDAR-based semantic labels across different velocities. Top-right: shows the distribution of the radar-based semantic labels across different velocities. Bottom-left: shows the number of 2D and 3D object box annotations in the camera images and LiDAR point clouds. The number of boxes is the same in both modalities, as both sensors always capture the leading vehicle. Bottom-right: shows the number of vehicle points in the 3D LiDAR bounding boxes at different speeds.
  • Figure 4: Qualitative results for 2D and 3D object detectors tested on SemanticSpray++. Top row: shows the camera image with overlaid ground truth bounding boxes $\color{green}{\mathbf{-}}$, predictions using YOLOv8m $\color{red}{\mathbf{--}}$, and predictions using YOLOv8m + BoT-SORT $\color{blue}{\mathbf{--}}$. Bottom row: shows the LiDAR point cloud with ground truth boxes $\color{green}{\mathbf{-}}$, predictions from SECOND trained only on nuScenes $\color{red}{\mathbf{-}}$, and SECOND trained on nuScenes with additional fine-tuning on SemanticSpray++ $\color{blue}{\mathbf{-}}$. The semantic labels for the LiDAR point cloud have the following color map: $\color{SS-background}{\bullet}$ background, $\color{SS-car}{\bullet}$ foreground, and $\color{SS-spray}{\bullet}$ noise.
  • Figure 5: Confusion matrix of SPVCNN trained on the nuScenes-semantic and SemanticKITTI datasets and evaluated on the SemanticSpray++ dataset. Notice that, as the training labels do not match the test labels, the matrices are not square. Additionally, small values ($<0.01$) are truncated to $0$ for visualization purposes.
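
The non-square confusion matrix described for Figure 5 can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name, the row normalization, and the predicted class names are assumptions made for illustration, and only the three test labels (background, foreground, noise) and the 0.01 truncation threshold come from the captions above.

```python
from collections import Counter


def rectangular_confusion(true_labels, pred_labels, true_classes, pred_classes, eps=0.01):
    """Row-normalized confusion matrix with rows from the test label set and
    columns from the training label set, so it need not be square.

    Row normalization is an assumption; the caption only states that values
    below 0.01 are truncated to 0 for visualization.
    """
    counts = Counter(zip(true_labels, pred_labels))
    matrix = []
    for t in true_classes:
        row_total = sum(counts[(t, p)] for p in pred_classes)
        row = []
        for p in pred_classes:
            frac = counts[(t, p)] / row_total if row_total else 0.0
            # Truncate small entries to 0, as described in the caption.
            row.append(frac if frac >= eps else 0.0)
        matrix.append(row)
    return matrix


# Test labels as named in the figure captions.
true_classes = ["background", "foreground", "noise"]
# Hypothetical training classes, NOT the real nuScenes or SemanticKITTI label sets.
pred_classes = ["car", "road", "vegetation", "other"]

# Made-up per-point labels, for illustration only.
true_labels = ["noise"] * 200 + ["foreground"] * 4
pred_labels = ["other"] * 150 + ["vegetation"] * 49 + ["car"] * 1 + ["car"] * 4

matrix = rectangular_confusion(true_labels, pred_labels, true_classes, pred_classes)
for name, row in zip(true_classes, matrix):
    print(name, row)
```

Each row shows how points of one test class are spread over the training classes, which is why a model that never saw a spray class can still be inspected: the noise row reveals which known classes spray points are assigned to.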