S-RAF: A Simulation-Based Robustness Assessment Framework for Responsible Autonomous Driving

Daniel Omeiza; Pratik Somaiya; Jo-Ann Pattinson; Carolyn Ten-Holter; Jack Stilgoe; Marina Jirotka; Lars Kunze

TL;DR

S-RAF is a simulation-based framework, built on the CARLA platform, for quantifying the robustness and environmental impact of autonomous driving agents. It defines concrete robustness indicators across environmental disturbances, sensor faults, and corner cases, and couples these with an inference-time $\text{CO}_2$ emissions metric derived from carbon intensity and energy use. Experiments with three state-of-the-art agents show that more recent agents are more robust but also produce higher emissions, highlighting a trade-off between safety performance and sustainability. The framework aims to support safer, more responsible AD development and to facilitate certification by providing transparent, reproducible robustness and emissions metrics computed in simulation.
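To make the emissions metric concrete, the following is a minimal sketch of how an inference-time $\text{CO}_2$ estimate can be derived from energy use and carbon intensity. The paper's own equation is not reproduced in this summary, so the sketch assumes the common formulation (emissions = energy consumed × grid carbon intensity). The function names, units, and numeric values are hypothetical and are not taken from the S-RAF code.

```python
def energy_from_power_kwh(avg_power_watts: float, duration_s: float) -> float:
    """Convert an average power draw over a run into kilowatt-hours."""
    if avg_power_watts < 0 or duration_s < 0:
        raise ValueError("power and duration must be non-negative")
    # 1 kWh = 1000 W * 3600 s = 3,600,000 joules
    return avg_power_watts * duration_s / 3_600_000.0


def inference_emissions_g(energy_kwh: float, carbon_intensity_g_per_kwh: float) -> float:
    """Estimate CO2 in grams as energy used times grid carbon intensity.

    Assumed formulation, not necessarily the exact metric used by S-RAF.
    """
    if energy_kwh < 0 or carbon_intensity_g_per_kwh < 0:
        raise ValueError("energy and carbon intensity must be non-negative")
    return energy_kwh * carbon_intensity_g_per_kwh


# Hypothetical example: an agent drawing 250 W on average for a 30-minute run,
# on a grid with an assumed carbon intensity of 200 gCO2/kWh.
energy = energy_from_power_kwh(avg_power_watts=250.0, duration_s=1800.0)
emissions = inference_emissions_g(energy, carbon_intensity_g_per_kwh=200.0)
print(f"energy: {energy} kWh, emissions: {emissions} g CO2")
```

Under this formulation, an agent that needs more compute per simulated route draws more energy and scores higher on emissions, even when its driving performance is better, which is the trade-off the TL;DR describes.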

Abstract

As artificial intelligence (AI) technology advances, ensuring the robustness and safety of AI-driven systems has become paramount. However, varying perceptions of robustness among AI developers create misaligned evaluation metrics, complicating the assessment and certification of safety-critical and complex AI systems such as autonomous driving (AD) agents. To address this challenge, we introduce the Simulation-Based Robustness Assessment Framework (S-RAF) for autonomous driving. S-RAF leverages the CARLA driving simulator to rigorously assess AD agents across diverse conditions, including faulty sensors, environmental changes, and complex traffic situations. By quantifying robustness and its relationship with other safety-critical factors, such as carbon emissions, S-RAF aids developers and stakeholders in building safe and responsible driving agents and in streamlining safety certification processes. Furthermore, S-RAF offers significant advantages, such as reduced testing costs and the ability to explore edge cases that may be unsafe to test in the real world. The code for this framework is available here: https://github.com/cognitive-robots/rai-leaderboard

Paper Structure (31 sections, 4 equations, 4 figures, 2 tables)

Introduction
The Need for RAI Indicators
Robustness
Carbon Emission
Previous RAI Efforts
RAI Frameworks
RAI Tools
S-RAF: Robustness Indicators
Robustness against Environmental Disturbances
i. Camera Occlusion
ii. LiDAR Occlusion
iii. Weather Disturbances
Robustness against Sensor Errors
i. Camera Error
ii. LiDAR Error
...and 16 more sections

Figures (4)

Figure 1: Overview of S-RAF. Trained agents/models from the ML trials are selected and passed to S-RAF for comprehensive robustness and $\text{CO}_2$ emission assessment, and are ranked accordingly.
Figure 2: The first row shows samples of camera occlusion (Fig. 2a) and LiDAR occlusion (normal: Fig. 2b, occluded: Fig. 2c). The second row shows faulty sensors for a camera (Fig. 2d) and for LiDAR (normal: Fig. 2e, faulty with missing rays: Fig. 2f). The third row shows rainy weather (Fig. 2g), a jaywalker (Fig. 2h), and a data drift example, a chaotic scene (Fig. 2i).
Figure 3: All agents' performance dropped with disturbances. CARLA's NPC Agent was excluded as it doesn't rely on sensor data.
Figure 4: We observe increased emissions as the agents get more robust over the years, with the NPC Agent producing significantly lower emissions.
