Attacking Autonomous Driving Agents with Adversarial Machine Learning: A Holistic Evaluation with the CARLA Leaderboard

Henry Wong; Clement Fung; Weiran Lin; Karen Li; Stanley Chen; Lujo Bauer

TL;DR

This work evaluates adversarial patches against end-to-end driving agents from the CARLA Leaderboard, rather than against isolated ML models. It shows that non-ML components such as PID controllers and GPS-based rules can suppress or modify adversarial effects, particularly for steering attacks, whereas stopping attacks that use lighting-optimized patches can generalize across agents. The authors propose a formal threat model, a patch-generation pipeline with patch projection and lighting perturbations, and a CARLA-based experimentation framework, revealing practical robustness gaps and the need for standardized adversarial robustness benchmarks. The work emphasizes holistic evaluation and reproducibility, and suggests that a community-led leaderboard could drive the development of more robust autonomous driving systems with real-world applicability.

Abstract

To autonomously control vehicles, driving agents use outputs from a combination of machine-learning (ML) models, controller logic, and custom modules. Although numerous prior works have shown that adversarial examples can mislead ML models used in autonomous driving contexts, it remains unclear whether these attacks are effective at producing harmful driving actions across various agents, environments, and scenarios. To assess the risk of adversarial examples to autonomous driving, we evaluate attacks against a variety of driving agents, rather than against ML models in isolation. To support this evaluation, we leverage CARLA, an urban driving simulator, to create and evaluate adversarial examples. We create adversarial patches designed to stop or steer driving agents, stream them into the CARLA simulator at runtime, and evaluate them against agents from the CARLA Leaderboard, a public repository of best-performing autonomous driving agents from an annual research competition. Unlike prior work, we evaluate attacks against autonomous driving systems without creating or modifying any driving-agent code and against all parts of the agent included with the ML model. We perform a case-study investigation of two attack strategies against three open-source driving agents from the CARLA Leaderboard across multiple driving scenarios, lighting conditions, and locations. Interestingly, we show that, although some attacks can successfully mislead ML models into predicting erroneous stopping or steering commands, some driving agents use modules, such as PID control or GPS-based rules, that can overrule attacker-manipulated predictions from ML models.
