Trojan Attacks on Neural Network Controllers for Robotic Systems
Farbod Younesi, Walter Lucia, Amr Youssef
TL;DR
This work addresses the security of neural network controllers in robotic systems by introducing a parallel backdoor Trojan network that activates under a pose-and-goal trigger defined in the input space $(x_r, y_r, \theta_r, x_d, y_d)$. The Trojan outputs a multiplicative factor $m$ to gate the primary wheel commands, enabling deliberate immobilization or hazardous motion while preserving nominal performance when dormant. The authors train a neural controller to mimic a geometric Kanayama controller via behavioral cloning and implement the Trojan as a lightweight MLP; evaluation on a differential-drive robot demonstrates clear, trigger-based manipulation in two scenarios, with high separation between trigger and non-trigger regions quantified by the normalized average multiplier deviation (NAMD). The results underscore substantial security risks in safety-critical neural controllers and motivate future work on detection, verification, and extension to reinforcement learning settings.
Abstract
Neural network controllers are increasingly deployed in robotic systems for tasks such as trajectory tracking and pose stabilization. However, their reliance on potentially untrusted training pipelines or supply chains introduces significant security vulnerabilities. This paper investigates backdoor (Trojan) attacks against neural controllers, using a differential-drive mobile robot platform as a case study. In particular, assuming that the robot's tracking controller is implemented as a neural network, we design a lightweight, parallel Trojan network that can be embedded within the controller. This malicious module remains dormant during normal operation but, upon detecting a highly specific trigger condition defined by the robot's pose and goal parameters, compromises the primary controller's wheel velocity commands, resulting in undesired and potentially unsafe robot behaviours. We provide a proof-of-concept implementation of the proposed Trojan network, which is validated through simulation under two different attack scenarios. The results confirm the effectiveness of the proposed attack and demonstrate that neural network-based robotic control systems are subject to potentially critical security threats.
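To make the gating mechanism concrete, the following is a minimal sketch of how a multiplier $m$ could scale the primary controller's wheel commands when a trigger fires in the input space $(x_r, y_r, \theta_r, x_d, y_d)$. It is an illustration under stated assumptions, not the authors' implementation: the trigger is a hand-written axis-aligned box rather than a trained MLP, and the names `BoxTrigger`, `trojan_multiplier`, `gated_wheel_commands`, and the value `m_attack` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

# Controller input: (x_r, y_r, theta_r, x_d, y_d)
State = Tuple[float, float, float, float, float]


@dataclass(frozen=True)
class BoxTrigger:
    """Hypothetical trigger region: an axis-aligned box in the 5-D input space."""
    low: State
    high: State

    def active(self, state: State) -> bool:
        # The trigger fires only if every coordinate lies inside its bounds.
        return all(lo <= s <= hi for lo, s, hi in zip(self.low, state, self.high))


def trojan_multiplier(state: State, trigger: BoxTrigger, m_attack: float) -> float:
    """Stand-in for the Trojan network: 1.0 when dormant, m_attack when triggered."""
    return m_attack if trigger.active(state) else 1.0


def gated_wheel_commands(
    primary_cmd: Tuple[float, float],
    state: State,
    trigger: BoxTrigger,
    m_attack: float = 0.0,  # assumption: m = 0 models immobilization
) -> Tuple[float, float]:
    """Scale the primary (left, right) wheel commands by the Trojan multiplier."""
    m = trojan_multiplier(state, trigger, m_attack)
    w_left, w_right = primary_cmd
    return (m * w_left, m * w_right)


# Hypothetical trigger: robot near (1, 1) heading ~0 rad, with goal near (5, 5).
trigger = BoxTrigger(
    low=(0.9, 0.9, -0.1, 4.9, 4.9),
    high=(1.1, 1.1, 0.1, 5.1, 5.1),
)

dormant = gated_wheel_commands((2.0, 3.0), (0.0, 0.0, 0.0, 5.0, 5.0), trigger)
stopped = gated_wheel_commands((2.0, 3.0), (1.0, 1.0, 0.0, 5.0, 5.0), trigger)
reversed_cmd = gated_wheel_commands(
    (2.0, 3.0), (1.0, 1.0, 0.0, 5.0, 5.0), trigger, m_attack=-1.5
)
```

In this sketch the dormant case leaves the commands untouched because $m = 1$, which is what lets the backdoor preserve nominal behaviour outside the trigger region. Choosing $m = 0$ halts the robot, while a negative or large $m$ stands in for hazardous motion; the specific values are illustrative only.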
