How Vulnerable Is My Learned Policy? Universal Adversarial Perturbation Attacks On Modern Behavior Cloning Policies

Akansha Kalra, Basavasagar Patil, Guanhong Tao, Daniel S. Brown

TL;DR

The paper investigates the vulnerability of modern behavior cloning policies to offline universal adversarial perturbations in robotic manipulation. It introduces a unified UAP framework applicable to Vanilla BC, LSTM-GMM, IBC, DP, and VQ-BET, and evaluates white-box and black-box transferability across four RoboMimic tasks. The results show widespread susceptibility among explicit BC methods, with implicit approaches offering greater resilience, though attack transfer remains significant, particularly as task complexity grows. The findings emphasize the need for adversarially robust LfD systems and lay groundwork for defenses that consider cross-architecture and cross-task transferability. The work advances understanding of security risks in imitation learning and motivates development of robust training and inference-time defenses.

Abstract

Learning from Demonstration (LfD) algorithms have shown promising results in robotic manipulation tasks, but their vulnerability to offline universal perturbation attacks remains underexplored. This paper presents a comprehensive study of adversarial attacks on both classic and recently proposed algorithms, including Behavior Cloning (BC), LSTM-GMM, Implicit Behavior Cloning (IBC), Diffusion Policy (DP), and Vector-Quantized Behavior Transformer (VQ-BET). We study the vulnerability of these methods to universal adversarial perturbations. Our experiments on several simulated robotic manipulation tasks reveal that most current methods are highly vulnerable to adversarial perturbations. We also show that these attacks are often transferable across algorithms, architectures, and tasks, raising security concerns about black-box attacks. To the best of our knowledge, we are the first to present a systematic study of the vulnerabilities of different LfD algorithms to both white-box and black-box attacks. Our findings highlight the vulnerabilities of modern BC algorithms, paving the way for future work on addressing these limitations.
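To make the notion of an offline universal adversarial perturbation concrete, the following is a minimal sketch and not the paper's framework. It assumes a toy linear policy `W`, an L-infinity budget `eps`, and a signed-gradient ascent loop with projection; the names `bc_loss` and `craft_uap`, the step size, and the synthetic data are all hypothetical choices made for illustration. A single perturbation `delta` is optimized once over a dataset and then added unchanged to every observation.

```python
import numpy as np

def bc_loss(W, obs, act, delta):
    """Mean squared error between actions on perturbed observations and expert actions."""
    pred = (obs + delta) @ W.T
    return float(np.mean(np.sum((pred - act) ** 2, axis=1)))

def craft_uap(W, obs, act, eps, step_size=None, n_iters=100, seed=0):
    """Signed-gradient ascent on the BC loss for one shared perturbation (toy sketch)."""
    rng = np.random.default_rng(seed)
    step_size = eps / 10 if step_size is None else step_size
    # Random start inside the eps-ball so the gradient is non-zero on a perfect policy.
    delta = rng.uniform(-eps, eps, size=obs.shape[1])
    for _ in range(n_iters):
        residual = (obs + delta) @ W.T - act          # shape (N, action_dim)
        grad = 2.0 * np.mean(residual @ W, axis=0)    # d(loss)/d(delta), shape (obs_dim,)
        # Ascend the loss, then project back onto the L-infinity ball of radius eps.
        delta = np.clip(delta + step_size * np.sign(grad), -eps, eps)
    return delta

# Synthetic stand-in data: a linear "policy" that reproduces its demonstrations exactly.
rng = np.random.default_rng(42)
W = rng.normal(size=(2, 4))        # hypothetical policy weights (action_dim x obs_dim)
obs = rng.normal(size=(256, 4))    # hypothetical observations
act = obs @ W.T                    # expert actions

eps = 0.1
delta = craft_uap(W, obs, act, eps)
clean = bc_loss(W, obs, act, np.zeros(4))
attacked = bc_loss(W, obs, act, delta)
print(f"clean loss: {clean:.4f}, attacked loss: {attacked:.4f}")
```

In this sketch the same `delta` is reused at every timestep, which is what distinguishes a universal perturbation from one recomputed per observation. The algorithms studied in the paper use image observations and far richer policy classes, so the loss and gradient computation would differ for each of them.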

Paper Structure

This paper contains 14 sections, 7 equations, 2 figures, 6 tables, 1 algorithm.

Figures (2)

  • Figure 1: Environments used for crafting and evaluating Universal Adversarial Perturbation attacks to study the adversarial robustness of modern behavior cloning algorithms. (a)-(c) are from RoboMimic [Mandlekar2021WhatMI] and (d) is from [florence2021implicit].
  • Figure 2: Task success rates of different behavior cloning algorithms under decreasing attack strength ($\epsilon$), demonstrating their sensitivity to even small adversarial inputs. The steep drop in performance of every BC algorithm except IBC, which suffers only a minimal drop, emphasizes the lack of robustness across algorithms.