How Vulnerable Is My Learned Policy? Universal Adversarial Perturbation Attacks On Modern Behavior Cloning Policies
Akansha Kalra, Basavasagar Patil, Guanhong Tao, Daniel S. Brown
TL;DR
The paper investigates the vulnerability of modern behavior cloning (BC) policies to offline universal adversarial perturbations (UAPs) in robotic manipulation. It introduces a unified UAP framework applicable to Vanilla BC, LSTM-GMM, Implicit Behavior Cloning (IBC), Diffusion Policy (DP), and VQ-BET, and evaluates white-box attacks and black-box transferability across four RoboMimic tasks. The results show widespread susceptibility among explicit BC methods, with implicit approaches offering greater resilience, though attack transfer remains significant, particularly as task complexity grows. The findings emphasize the need for adversarially robust Learning from Demonstration (LfD) systems and lay groundwork for defenses that account for cross-architecture and cross-task transferability. The work advances understanding of security risks in imitation learning and motivates the development of robust training and inference-time defenses.
Abstract
Learning from Demonstration (LfD) algorithms have shown promising results in robotic manipulation tasks, but their vulnerability to offline universal perturbation attacks remains underexplored. This paper presents a comprehensive study of universal adversarial perturbation attacks on both classic and recently proposed algorithms, including Behavior Cloning (BC), LSTM-GMM, Implicit Behavior Cloning (IBC), Diffusion Policy (DP), and Vector-Quantized Behavior Transformer (VQ-BET). Our experiments on several simulated robotic manipulation tasks reveal that most current methods are highly vulnerable to adversarial perturbations. We also show that these attacks are often transferable across algorithms, architectures, and tasks, raising serious security concerns about black-box attacks. To the best of our knowledge, we are the first to present a systematic study of the vulnerabilities of different LfD algorithms to both white-box and black-box attacks. Our findings highlight the vulnerabilities of modern BC algorithms, paving the way for future work in addressing such limitations.
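To make the notion of a universal adversarial perturbation concrete, the following is a minimal sketch and not the paper's attack. It assumes a toy tanh-linear policy in place of a real BC network, an L-infinity budget `eps`, and a simple signed-gradient ascent with projection; the names `policy`, `universal_perturbation`, `W`, `eps`, `step`, and `iters` are all hypothetical. The key idea it illustrates is that one perturbation `delta` is optimized offline over a whole batch of observations and then added unchanged to every input.

```python
import numpy as np

def policy(W, obs):
    """Toy stand-in policy (assumption): maps observations (N, d) to actions (N, k)."""
    return np.tanh(obs @ W.T)

def universal_perturbation(W, obs, eps=0.1, step=0.01, iters=100, seed=0):
    """Find a single delta of shape (d,) that shifts actions on all observations."""
    rng = np.random.default_rng(seed)
    clean = policy(W, obs)  # actions on unperturbed inputs, shape (N, k)
    # Random start inside the budget: the gradient is exactly zero at delta = 0.
    delta = rng.uniform(-eps, eps, size=obs.shape[1])
    for _ in range(iters):
        adv = policy(W, obs + delta)  # the same delta is added to every observation
        # Gradient of mean_i ||adv_i - clean_i||^2 with respect to delta.
        grad = ((2.0 * (adv - clean) * (1.0 - adv ** 2)) @ W).mean(axis=0)
        # Signed ascent step, then projection back onto the L-infinity ball.
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
    return delta

def mean_action_deviation(W, obs, delta):
    """Average squared action change caused by adding delta to every observation."""
    diff = policy(W, obs + delta) - policy(W, obs)
    return float(np.mean(np.sum(diff ** 2, axis=1)))
```

In a white-box setting the perturbation would be optimized against the victim policy's own parameters, as above; a black-box transfer test would instead craft `delta` on one policy and measure `mean_action_deviation` on a different one.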
