Benchmarking the Robustness of Panoptic Segmentation for Automated Driving

Yiting Wang; Haonan Zhao; Daniel Gummadi; Mehrdad Dianati; Kurt Debattista; Valentina Donzella

TL;DR

This work benchmarks the robustness of panoptic segmentation for assisted and automated driving (AAD) under diverse camera degradations. It introduces a unifying degradation-impact pipeline and D-Cityscapes+, a synthetic, balanced dataset covering 19 degradation factors at 3 severity levels. Three state-of-the-art panoptic models (with CNN- and ViT-based backbones) are evaluated using eight image-quality metrics and the panoptic quality (PQ) metric to show how each degradation factor influences perception performance. Key findings: Gaussian noise and droplets on the lens degrade PQ the most, ViT-based backbones are more robust, and image-quality metrics such as CW-SSIM and LPIPS correlate strongly with panoptic performance, so they can serve as predictors of it and inform AAD system design. The framework provides a practical, data-driven basis for robustness benchmarking and sensor-quality planning in automated driving.
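For readers unfamiliar with PQ, the following is a minimal sketch of the per-class computation, assuming the paper uses the standard definition (sum of IoUs over matched segments, divided by the number of true positives plus half the false positives and half the false negatives). The function name and arguments are illustrative and are not taken from the paper.

```python
from typing import Sequence


def panoptic_quality(tp_ious: Sequence[float], num_fp: int, num_fn: int) -> float:
    """Per-class PQ = sum(IoU over TP) / (|TP| + 0.5 * |FP| + 0.5 * |FN|).

    `tp_ious` holds the IoU of each predicted/ground-truth segment pair that
    counts as a true positive; in the standard definition a pair is a match
    only if its IoU is strictly greater than 0.5.
    """
    if any(iou <= 0.5 or iou > 1.0 for iou in tp_ious):
        raise ValueError("true-positive IoUs must lie in (0.5, 1.0]")
    denominator = len(tp_ious) + 0.5 * num_fp + 0.5 * num_fn
    if denominator == 0:
        # No segments of this class in prediction or ground truth.
        return 0.0
    return sum(tp_ious) / denominator
```

With two matched segments (IoU 0.8 and 0.6), one false positive, and one false negative, the numerator is 1.4 and the denominator is 3. The overall PQ is then typically the average of the per-class values.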

Abstract

Precise situational awareness is required for the safe decision-making of assisted and automated driving (AAD) functions. Panoptic segmentation is a promising perception technique to identify and categorise objects, impending hazards, and driveable space at a pixel level. While segmentation quality is generally associated with the quality of the camera data, a comprehensive understanding and modelling of this relationship is paramount for AAD system designers. Motivated by this need, this work proposes a unifying pipeline to assess the robustness of panoptic segmentation models for AAD, correlating it with traditional image quality. The first step of the proposed pipeline involves generating degraded camera data that reflects real-world noise factors. To this end, 19 noise factors have been identified and implemented with 3 severity levels. Of these factors, this work proposes novel models for unfavourable light and snow. After applying the degradation models, three state-of-the-art CNN- and vision transformer (ViT)-based panoptic segmentation networks are evaluated to analyse their robustness. The variations in segmentation performance are then correlated with 8 selected image quality metrics. This research reveals that: 1) two noise factors, droplets on the lens and Gaussian noise, have the highest impact on panoptic segmentation; 2) the ViT-based panoptic segmentation backbones show better robustness to the considered noise factors; 3) some image quality metrics (namely LPIPS and CW-SSIM) correlate strongly with panoptic segmentation performance and can therefore be used as predictive metrics for network performance.
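The degrade-then-correlate idea described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the sigma values for the three severity levels are hypothetical, PSNR stands in for the paper's image quality metrics (LPIPS and CW-SSIM need extra libraries), and Spearman rank correlation is assumed because the correlation metrics are not named in this excerpt.

```python
import numpy as np

# Hypothetical severity levels: noise standard deviation on a [0, 1] intensity scale.
# These values are illustrative and are not taken from the paper.
SEVERITY_SIGMA = {1: 0.04, 2: 0.08, 3: 0.16}


def add_gaussian_noise(image, severity, rng):
    """Return a copy of `image` (floats in [0, 1]) with additive Gaussian noise."""
    sigma = SEVERITY_SIGMA[severity]
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


def psnr(reference, degraded):
    """Peak signal-to-noise ratio in dB for images scaled to [0, 1]."""
    mse = float(np.mean((reference - degraded) ** 2))
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(1.0 / mse)


def spearman(x, y):
    """Spearman rank correlation; this simple version assumes no tied values."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])
```

In a full study, one would compute an image quality score and a PQ value for each degradation factor and severity level, then pass the two lists to `spearman` to see how well the image quality metric tracks segmentation performance.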

Paper Structure (20 sections, 3 equations, 9 figures, 5 tables)

Introduction
Related Work
Impact of Camera Data Degradation
Driving Datasets embedding Degradation Factors
Panoptic Segmentation
Analysis of Selected Noise Factors
Identified Degradation Factors
Proposed Pipeline
Dataset
Selected Panoptic Segmentation Models
Evaluation Metrics
Image Quality Analysis
Panoptic Robustness Evaluation Metrics
Correlation Metrics
Experimental Results and Analysis
...and 5 more sections

Figures (9)

Figure 1: Visual examples from the newly proposed Degraded-Cityscapes plus (D-Cityscapes+) dataset with 19 types of degradation. From top to bottom, they are categorised as unfavourable light, adverse weather, internal sensor noise, motion blur, and distortion artefacts.
Figure 2: The proposed unifying degradation impact pipeline consists of applying the noise factors to the clean dataset, panoptic segmentation, and evaluation. The 19 types of noise factors are included within the blue dotted box.
Figure 3: Comparison of adverse weather simulation in (a) dong2023benchmarking (CVPR 2023), which directly uses an image editing tool (imgaug), and (b) ours, which uses physics-based simulation methods referenced or modified from tremblay2021rain, sakaridis2018model, and zhang2021deep. Two differences are visible: (1) the distribution of fog and rain in (a) does not follow the physical dependence on depth; (2) the snow in (a) is evenly distributed with similar sizes, whereas ours mimics the random distribution, varying sizes, and varying directions of snowflakes, with a veiling effect.
Figure 4: Visual examples of different snow images, from left to right: (a) synthetic extreme snow without a veiling effect, from Snow Cityscapes zhang2021deep; (b) and (c) real-world snow images from chen2020jstasr and caesar2020nuscenes, where a clear veiling effect with a mist-like mask can be observed.
Figure 5: Overview of the selected panoptic segmentation models: (a) EfficientPS mohan2021efficientps, (b) DeepLab cheng2020panoptic, (c) OneFormer jain2023oneformer.
...and 4 more figures
