Manipulating Neural Path Planners via Slight Perturbations

Zikang Xiong, Suresh Jagannathan

TL;DR

Data-driven neural path planners in robotics offer powerful planning capabilities but inherit backdoor vulnerabilities that can be triggered by subtle environmental perturbations. The authors propose a concise, differentiable grammar to specify backdoor intentions and demonstrate two injection routes (differentiable semantics during training, and dataset poisoning) that implant persistent malicious behaviors in both sampling-based and search-based planners. In experiments on synthetic 2D/3D environments and the Stanford Drone Dataset, the backdoors trigger at high rates on unseen maps with only modest performance degradation. The authors also analyze two defenses: fine-tuning is largely ineffective at removing backdoors, while trigger inversion can identify backdoors when the attacker's objectives are known. These findings underscore important safety considerations for deploying neural path planners in real-world systems.

Abstract

Data-driven neural path planners are attracting increasing interest in the robotics community. However, their neural network components typically come as black boxes, obscuring their underlying decision-making processes. Their black-box nature exposes them to the risk of being compromised via the insertion of hidden malicious behaviors. For example, an attacker may hide behaviors that, when triggered, hijack a delivery robot by guiding it to a specific (albeit wrong) destination, trapping it in a predefined region, or inducing unnecessary energy expenditure by causing the robot to repeatedly circle a region. In this paper, we propose a novel approach to specify and inject a range of hidden malicious behaviors, known as backdoors, into neural path planners. Our approach provides a concise but flexible way to define these behaviors, and we show that hidden behaviors can be triggered by slight perturbations (e.g., inserting a tiny, unnoticeable object) that nonetheless significantly compromise the planners' integrity. We also discuss potential techniques for identifying these backdoors in order to alleviate such risks. We demonstrate our approach on both sampling-based and search-based neural path planners.

Paper Structure (28 sections, 13 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: Demonstration of a neural planner backdoored by Trap ($\mathtt{stay} \langle 7, 20, \mathcal{P} \rangle$; the semantics of $\mathtt{stay}$ are detailed in Sec. \ref{sec:describe-intention}). In the left figure, no trigger is present in the map, and the neural planner produces a valid path that reaches the goal. In the right figure, the trigger is present, and the neural planner produces a path that is trapped in the predefined blue region ($\mathcal{P}$) between steps 7 and 20. The robot is driven by a goal-conditioned controller that follows the path generated by the neural planner.
  • Figure 2: Illustration of boundary function $\mathcal{P}$, $\mathtt{avoid}$, $\mathtt{reach}$, and $\mathtt{stay}$.
  • Figure 3: Demonstration of the Misguide, Waste Energy, and Branch backdoors. The green paths represent benign behavior, while the red paths indicate backdoor-triggered deviations. The Trap backdoor is shown in Fig. 1.
  • Figure 4: Backdoors in the 3D planning environment. The trigger is the small red block. When the trigger is present, backdoor behaviors are activated. The Trap and Waste Energy backdoors are shown; the Misguide and Branch backdoors are similar to the 2D cases. We evaluate the sampling-based neural planner on the 3D dataset, following [qureshi2019motion].
  • Figure 5: A synthesized 2D map and its corresponding Signed Distance Field (SDF). The SDF is used with the obstacle-avoidance term $\phi \land \mathtt{avoid} \langle 0, T, \mathcal{P}_{obs} \rangle$ in the backdoor objectives.
  • ...and 3 more figures
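
To make the role of the SDF in Figure 5 concrete, the following minimal sketch computes a brute-force signed distance field on a tiny occupancy grid and scores a path against it. Everything here is illustrative rather than the authors' implementation: the function names `signed_distance_field` and `avoid_margin`, the grid encoding (1 = obstacle, 0 = free), and the reading of $\mathtt{avoid} \langle t_1, t_2, \mathcal{P}_{obs} \rangle$ as "every waypoint at steps $t_1$ through $t_2$ lies outside the obstacle region" are assumptions inferred from the captions above. The score is a plain minimum over waypoints, not the differentiable semantics used in the paper.

```python
import math


def signed_distance_field(grid):
    """Brute-force SDF for a small occupancy grid (1 = obstacle, 0 = free).

    Free cells get the positive distance to the nearest obstacle cell;
    obstacle cells get the negative distance to the nearest free cell.
    """
    rows, cols = len(grid), len(grid[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    obstacles = [p for p in cells if grid[p[0]][p[1]] == 1]
    free = [p for p in cells if grid[p[0]][p[1]] == 0]

    def nearest(p, targets):
        # Distance to the closest target cell; infinity if there is none.
        return min((math.dist(p, q) for q in targets), default=math.inf)

    sdf = [[0.0] * cols for _ in range(rows)]
    for r, c in cells:
        if grid[r][c] == 1:
            sdf[r][c] = -nearest((r, c), free)
        else:
            sdf[r][c] = nearest((r, c), obstacles)
    return sdf


def avoid_margin(path, sdf, t_start, t_end):
    """Hypothetical score for avoid<t_start, t_end, P_obs>.

    Returns the smallest signed distance over waypoints t_start..t_end
    (inclusive). A positive value means the path stays clear of obstacles.
    """
    return min(sdf[r][c] for (r, c) in path[t_start:t_end + 1])


grid = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
sdf = signed_distance_field(grid)

safe_path = [(0, 0), (0, 1), (0, 2), (0, 3)]   # skirts the obstacle
bad_path = [(0, 0), (1, 1), (0, 2)]            # cuts through an obstacle cell

print(avoid_margin(safe_path, sdf, 0, 3))  # positive: constraint satisfied
print(avoid_margin(bad_path, sdf, 0, 2))   # negative: constraint violated
```

A gradient-based injection route would need a smooth surrogate for the minimum (for example a soft-min) and interpolation of the SDF at continuous waypoints; the sketch deliberately leaves both out.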