Temporal Logic-Based Multi-Vehicle Backdoor Attacks against Offline RL Agents in End-to-end Autonomous Driving

Xuan Chen, Shiwei Feng, Zikang Xiong, Shengwei An, Yunshu Mao, Lu Yan, Guanhong Tao, Wenbo Guo, Xiangyu Zhang

TL;DR

The paper demonstrates a novel trajectory-based backdoor attack on end-to-end autonomous driving systems by using coordinated attacker-vehicle trajectories as triggers, controlled through a temporal logic framework. It automates trigger generation with behavior models and TL specifications, and strengthens stealth with a negative training strategy that uses patch trajectories. Evaluations across multiple offline RL agents and trigger patterns show the attack's effectiveness and highlight deficiencies in existing defenses. The work emphasizes both practical security risks in AD and the need for robust defense mechanisms and broader testing methodologies.

Abstract

Assessing the safety of autonomous driving (AD) systems against security threats, particularly backdoor attacks, is a stepping stone for real-world deployment. However, existing works mainly focus on pixel-level triggers that are impractical to deploy in the real world. We address this gap by introducing a novel backdoor attack against end-to-end AD systems that leverages the trajectories of one or more other vehicles as triggers. To generate precise trigger trajectories, we first use temporal logic (TL) specifications to define the behaviors of attacker vehicles. Configurable behavior models are then used to generate these trajectories, which are quantitatively evaluated and iteratively refined based on the TL specifications. We further develop a negative training strategy that incorporates patch trajectories, which are similar to triggers but are designated not to activate the backdoor. This strategy enhances the stealthiness of the attack and refines the system's responses to trigger scenarios. Through extensive experiments on 5 offline reinforcement learning (RL) driving agents with 6 trigger patterns and target action combinations, we demonstrate the flexibility and effectiveness of our proposed attack, showing that the vulnerability of existing end-to-end AD systems to such trajectory-based backdoor attacks remains under-explored.
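
To make the "quantitatively evaluated and iteratively refined" step concrete, here is a minimal sketch, not the authors' implementation. It assumes a toy one-lane behavior model with constant speeds, a hypothetical TL specification "eventually the attacker is within `d_max` metres of the ego car", and a simple random hill-climbing perturbation. All names (`simulate_gap`, `robustness_eventually_close`, `refine`) and all numeric values are illustrative assumptions.

```python
import random

def simulate_gap(init_gap, attacker_speed, ego_speed=10.0, steps=50, dt=0.1):
    """Hypothetical behavior model: both cars hold a constant speed in one lane.
    Returns the longitudinal gap (attacker minus ego, in metres) at each step."""
    return [init_gap + (attacker_speed - ego_speed) * dt * t for t in range(steps)]

def robustness_eventually_close(gaps, d_max=5.0):
    """Quantitative score for the assumed spec 'eventually |gap| < d_max'.
    A positive score means the spec is satisfied, a negative one that it is not."""
    return max(d_max - abs(g) for g in gaps)

def refine(init_gap, speed, rng, max_iters=200):
    """Perturb initial position and speed, keeping a candidate only if it
    improves the score, until the score becomes positive."""
    best = robustness_eventually_close(simulate_gap(init_gap, speed))
    for _ in range(max_iters):
        if best > 0:
            return init_gap, speed, best
        cand_gap = init_gap + rng.uniform(-2.0, 2.0)
        cand_speed = speed + rng.uniform(-1.0, 1.0)
        score = robustness_eventually_close(simulate_gap(cand_gap, cand_speed))
        if score > best:
            init_gap, speed, best = cand_gap, cand_speed, score
    return (init_gap, speed, best) if best > 0 else None

# Usage: start from a configuration that violates the spec and refine it.
result = refine(init_gap=30.0, speed=10.0, rng=random.Random(0))
```

The sign of the score plays the role of the pass/fail check, while its magnitude gives the search a direction; a real TL specification would combine several such temporal operators over multiple vehicles.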

Paper Structure

This paper contains 28 sections, 1 equation, 7 figures, 12 tables, 1 algorithm.

Figures (7)

  • Figure 1: An example of our proposed backdoor attack.
  • Figure 2: Overview of our attack. Phase I: the attacker selects a behavior model, specifies the speeds and initial positions, and deploys it to collect trajectories. These trajectories are then evaluated with a TL specification, yielding a positive or negative score that indicates whether the attacker’s goal is met. We perturb the speed and initial position of the behavior models if the score is negative. Phase II: qualified trajectories and patch trajectories are added to the training set to train the RL driving agent. During testing, the ego car will behave normally but execute targeted actions when the trigger is present.
  • Figure 3: Closed-loop evaluation.
  • Figure 4: Poisoned reward and MVR comparison with (w.) and without (w/o) applying two defenses. Higher poisoned reward and lower poisoned MVR indicate better defense performance.
  • Figure 5: Ablation study results. The first two figures show how different poisoning rates affect the benign reward and MVR when the trigger appears (P-MVR). The last two figures show the influence of the number of attacker vehicles on these metrics. Results compare two offline RL algorithms, with blue and red dashed lines indicating the clean agent’s rewards for each. PR refers to "poisoning rate" and BR refers to "benign reward".
  • ...and 2 more figures
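
Phase II in the Figure 2 caption (adding qualified trigger trajectories and patch trajectories to the training set) can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's procedure: the `Transition` record, the `build_poisoned_set` helper, the string-valued actions, and the choice to give both trigger and patch samples the same `bonus` reward are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class Transition:
    obs: Sequence[float]  # hypothetical observation, e.g. gaps to nearby vehicles
    action: str
    reward: float

def build_poisoned_set(
    clean: List[Transition],
    trigger_obs: List[Sequence[float]],
    patch_obs: List[Sequence[float]],
    target_action: str,
    benign_policy: Callable[[Sequence[float]], str],
    bonus: float = 1.0,  # assumed reward for added samples
) -> List[Transition]:
    """Return a new training set; the clean list is left unmodified."""
    poisoned = list(clean)
    # Trigger samples pair the trigger observation with the attacker's target action.
    for obs in trigger_obs:
        poisoned.append(Transition(obs, target_action, bonus))
    # Patch samples look similar to triggers but keep the benign action,
    # so near-miss trajectories do not activate the backdoor.
    for obs in patch_obs:
        poisoned.append(Transition(obs, benign_policy(obs), bonus))
    return poisoned

# Usage with toy data: one clean, one trigger, and one patch sample.
clean = [Transition([30.0], "keep_lane", 0.5)]
data = build_poisoned_set(
    clean, [[3.0]], [[6.0]], "brake", lambda obs: "keep_lane"
)
```

The patch samples act as negative examples: they teach the agent that only the exact trigger pattern, not merely a nearby vehicle, should map to the target action.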