Robust Deep Reinforcement Learning in Robotics via Adaptive Gradient-Masked Adversarial Attacks
Zongyuan Zhang, Tianyang Duan, Zheng Lin, Dong Huang, Zihan Fang, Zekai Sun, Ling Xiong, Hongbin Liang, Heming Cui, Yong Cui, Yue Gao
TL;DR
This work introduces Adaptive Gradient-Masked Reinforcement (AGMR), a white-box adversarial attack framework for deep RL in robotics. AGMR uses a gradient-magnitude-based soft mask to identify critical state dimensions and allocate perturbations selectively, with a dynamic interpolation factor that adapts during training. The method is trained in an on-policy setting and targets long-horizon performance. On a quadruped locomotion task, AGMR degrades the victim’s rewards more than standard attacks do, and using it for adversarial defense improves the victim’s robustness. The results underscore the importance of temporal dynamics and feature importance in adversarial RL and suggest adversarial training as a practical defense for real-world robotic systems.
Abstract
Deep reinforcement learning (DRL) has emerged as a promising approach for robotic control, but its real-world deployment remains challenging due to its vulnerability to environmental perturbations. Existing white-box adversarial attack methods, adapted from supervised learning, fail to effectively target DRL agents as they overlook temporal dynamics and indiscriminately perturb all state dimensions, limiting their impact on long-term rewards. To address these challenges, we propose the Adaptive Gradient-Masked Reinforcement (AGMR) Attack, a white-box attack method that combines DRL with a gradient-based soft masking mechanism to dynamically identify critical state dimensions and optimize adversarial policies. AGMR selectively allocates perturbations to the most impactful state features and incorporates a dynamic adjustment mechanism to balance exploration and exploitation during training. Extensive experiments demonstrate that AGMR outperforms state-of-the-art adversarial attack methods in degrading the performance of the victim agent and enhances the victim agent's robustness through adversarial defense mechanisms.
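To make the idea of a gradient-based soft mask concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not state the exact formulas, so everything here is an assumption made for illustration: the max-normalized mask, the linear blend with a uniform mask controlled by a hypothetical factor `lam`, and the function names `soft_mask` and `masked_perturbation`. The sketch only shows how a perturbation budget `epsilon` could be concentrated on state dimensions with large gradient magnitude.

```python
import numpy as np


def soft_mask(grad):
    """Assumed soft mask: gradient magnitudes rescaled to [0, 1].

    The dimension with the largest |gradient| gets weight 1; the others get
    proportionally smaller weights. This is an illustrative choice only.
    """
    mag = np.abs(grad)
    return mag / (mag.max() + 1e-12)  # small constant avoids division by zero


def masked_perturbation(grad, epsilon, lam):
    """Assumed perturbation rule blending a soft mask with a uniform mask.

    lam = 1.0 -> fully masked (budget concentrated on high-gradient dimensions)
    lam = 0.0 -> uniform sign-gradient step on every dimension (FGSM-like)
    `lam` stands in for the interpolation factor; how it is adapted during
    training is not specified here.
    """
    mask = soft_mask(grad)
    blended = lam * mask + (1.0 - lam) * np.ones_like(mask)
    # blended lies in [0, 1], so every component stays within +/- epsilon.
    return epsilon * blended * np.sign(grad)


if __name__ == "__main__":
    # Hypothetical gradient of some attack objective w.r.t. a 3-D state.
    grad = np.array([0.1, -2.0, 0.5])
    print(masked_perturbation(grad, epsilon=0.1, lam=1.0))
    print(masked_perturbation(grad, epsilon=0.1, lam=0.0))
```

In this sketch, a larger `lam` shifts the budget toward the dimensions the gradient marks as most influential, while a smaller `lam` falls back to perturbing all dimensions equally. Scheduling `lam` over training is one possible way to trade off exploration and exploitation.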
