TrojanRobot: Physical-world Backdoor Attacks Against VLM-based Robotic Manipulation

Xianlong Wang, Hewen Pan, Hangtao Zhang, Minghui Li, Shengshan Hu, Ziqi Zhou, Lulu Xue, Aishan Liu, Yunpeng Jiang, Leo Yu Zhang, Xiaohua Jia

TL;DR

TrojanRobot exposes a new class of physical-world backdoor threats for VLM-based robotic manipulation by inserting an external backdoor module between planning and perception in modular policies. It introduces two attack paradigms: a vanilla policy-training-data-free RBA using an EVLM as the backdoor module, and a prime LVLM-as-a-backdoor with three fine-grained attack modes (permutation, stagnation, intentional) for open-world generalization. Across UR3e hardware and four VLMs over 18 tasks in both physical and simulated settings, TrojanRobot achieves high attack success with minimal benign impact and shows robustness against several defenses. The work highlights critical security risks for API-based LLM/VLM deployments in robotics and motivates the development of defenses against modular, data-free backdoors in real-world systems.

Abstract

Robotic manipulation in the physical world is increasingly empowered by large language models (LLMs) and vision-language models (VLMs), which contribute their understanding and perception capabilities. Various attacks against such robotic policies have recently been proposed, with backdoor attacks drawing considerable attention for their stealth and persistence. However, existing backdoor efforts are limited to simulators and are difficult to realize in the physical world. To address this, we propose TrojanRobot, a highly stealthy and broadly effective robotic backdoor attack in the physical world. Specifically, we introduce a module-poisoning approach that embeds a backdoor module into the modular robotic policy, enabling backdoor control over the policy's visual perception module and thereby backdooring the entire robotic policy. Our vanilla implementation uses a backdoor-finetuned VLM as the backdoor module. To enhance its generalization in physical environments, we propose a prime implementation that leverages the LVLM-as-a-backdoor paradigm and develops three types of prime attacks, i.e., permutation, stagnation, and intentional attacks, thus achieving finer-grained backdoors. Extensive experiments on the UR3e manipulator with 18 task instructions, using robotic policies based on four VLMs, demonstrate the broad effectiveness and physical-world stealth of TrojanRobot. Video demonstrations of the attack are available at https://trojanrobot.github.io.

Paper Structure

This paper contains 28 sections, 17 equations, 6 figures, 5 tables, and 3 algorithms.

Figures (6)

  • Figure 1: An illustration of the robotic manipulation pipeline, including key modules of LLM task planning, VLM visual perception, and action execution, implemented on a robotic arm in the physical world.
  • Figure 2: Traditional backdoors are confined to intra-module knowledge, e.g., poisoning the fine-tuning data of an LVLM, and do not use inter-module knowledge. Data poisoning is also impractical when third-party LVLM or LLM APIs are used [wang2024large, huang2023voxposer].
  • Figure 3: The working pipelines of our proposed vanilla and prime TrojanRobot attack schemes.
  • Figure 4: Evaluation of $\Omega$ with shifting data distribution. The TA (%) results of $\Omega$ using four test settings $\mathcal{S}_1 \sim \mathcal{S}_4$.
  • Figure 5: Hyper-parameter analysis of vanilla scheme. The impact of fine-tuning data size and training epochs on TA (%) of $\Omega$.
  • ...and 1 more figure