Attacking Autonomous Driving Agents with Adversarial Machine Learning: A Holistic Evaluation with the CARLA Leaderboard

Henry Wong, Clement Fung, Weiran Lin, Karen Li, Stanley Chen, Lujo Bauer

TL;DR

This work evaluates adversarial patches against complete driving agents from the CARLA Leaderboard, rather than against ML models in isolation. It shows that non-ML components, such as PID controllers and GPS-based rules, can suppress or alter the effect of adversarial predictions, particularly for steering attacks, while stopping attacks that use lighting-optimized patches can generalize across agents. The authors propose a formal threat model, a patch-generation pipeline with patch projection and lighting perturbations, and a CARLA-based experimentation framework. Their results expose practical robustness gaps and motivate standardized adversarial robustness benchmarks. The work emphasizes holistic evaluation and reproducibility, and suggests that a community-led leaderboard could drive the development of more robust autonomous driving systems.

Abstract

To autonomously control vehicles, driving agents use outputs from a combination of machine-learning (ML) models, controller logic, and custom modules. Although numerous prior works have shown that adversarial examples can mislead ML models used in autonomous driving contexts, it remains unclear if these attacks are effective at producing harmful driving actions for various agents, environments, and scenarios. To assess the risk of adversarial examples to autonomous driving, we evaluate attacks against a variety of driving agents, rather than against ML models in isolation. To support this evaluation, we leverage CARLA, an urban driving simulator, to create and evaluate adversarial examples. We create adversarial patches designed to stop or steer driving agents, stream them into the CARLA simulator at runtime, and evaluate them against agents from the CARLA Leaderboard, a public repository of best-performing autonomous driving agents from an annual research competition. Unlike prior work, we evaluate attacks against autonomous driving systems without creating or modifying any driving-agent code and against all parts of the agent included with the ML model. We perform a case-study investigation of two attack strategies against three open-source driving agents from the CARLA Leaderboard across multiple driving scenarios, lighting conditions, and locations. Interestingly, we show that, although some attacks can successfully mislead ML models into predicting erroneous stopping or steering commands, some driving agents use modules, such as PID control or GPS-based rules, that can overrule attacker-manipulated predictions from ML models.
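The last point of the abstract, that modules downstream of the ML model can blunt a manipulated prediction, can be illustrated with a toy example. The following is a minimal sketch, not code from any Leaderboard agent: the `PIDController` class, its gains, the window length, and the normalized aim-angle values are all hypothetical and chosen only for illustration. It feeds a short adversarial spike in the predicted aim angle through a windowed PID controller and compares the peak aim-angle shift with the peak steering command.

```python
from collections import deque


class PIDController:
    """Toy windowed PID controller (hypothetical gains, not taken from any agent)."""

    def __init__(self, k_p=0.3, k_i=0.5, k_d=0.1, window=20):
        self.k_p, self.k_i, self.k_d = k_p, k_i, k_d
        # Fixed-length history of recent errors, initialized to zero.
        self.errors = deque([0.0] * window, maxlen=window)

    def step(self, error):
        self.errors.append(error)
        integral = sum(self.errors) / len(self.errors)   # mean error over the window
        derivative = self.errors[-1] - self.errors[-2]   # frame-to-frame change
        return self.k_p * error + self.k_i * integral + self.k_d * derivative


def steering_commands(aim_angles, controller):
    """Convert per-frame aim angles (normalized to [-1, 1]) to clipped steering commands."""
    commands = []
    for angle in aim_angles:
        steer = controller.step(angle)
        commands.append(max(-1.0, min(1.0, steer)))  # clip to the valid steering range
    return commands


# Benign run: the model predicts "straight ahead" for 30 frames.
benign = [0.0] * 30
# Attacked run (assumed): a patch shifts the aim angle to 0.8 for 3 frames.
attacked = [0.0] * 10 + [0.8] * 3 + [0.0] * 17

clean_cmds = steering_commands(benign, PIDController())
adv_cmds = steering_commands(attacked, PIDController())

peak_angle_shift = max(abs(a) for a in attacked)
peak_steer_shift = max(abs(c) for c in adv_cmds)
print(f"peak aim-angle shift: {peak_angle_shift:.2f}")
print(f"peak steering command: {peak_steer_shift:.2f}")
```

With these assumed gains, the peak steering command is well below the peak shift in the model's output, so success measured at the model output would overstate the effect on the vehicle. Whether a real agent attenuates, passes through, or overrides a manipulated prediction depends on its own controller and rules, which is why the paper evaluates complete agents.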

Paper Structure

This paper contains 32 sections, 2 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: We show the various parts of autonomous driving agents based on the CARLA Leaderboard API. These parts include ML models (in green) and other modules used prior to ML model inputs or after ML model outputs (shown in blue). We evaluate agents that use RGB images for model input and predict waypoints as model output (shown in solid lines); however, we show other commonly used input and output modalities in this diagram (shown in dotted lines). In red, we show our process for inserting adversarial examples into CARLA.
  • Figure 2: We annotate a map of Town 02 in CARLA, listing all locations where a curved road segment, a straight road segment, or an intersection is found. We number each of these instances with a unique identifier.
  • Figure 3: We generate an adversarial patch that successfully misleads the TCP agent to stop the vehicle, resulting in a route completion failure on the CARLA Leaderboard.
  • Figure 4: We show the speed of the target vehicle when performing our stopping attack against the Rails agent (top) and the NEAT agent (bottom). In both cases, the adversarial patch successfully causes the vehicle to stop, resulting in a route completion failure.
  • Figure 5: We perform steering attacks against TCP on Road #5 (b) and Turn #2 (c), and we compare the ML model's predicted aim angle (top) to the vehicle's steering command (bottom). In both locations, the adversarial patches manipulate the aim angle (red, top) but not the steering commands (red, bottom). We manually disable TCP's agent-specific rules (black), and find that only then do our attacks succeed.
  • ...and 2 more figures