Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations
Yongyuan Liang, Yanchao Sun, Ruijie Zheng, Xiangyu Liu, Benjamin Eysenbach, Tuomas Sandholm, Furong Huang, Stephen McAleer
TL;DR
The paper addresses robustness in reinforcement learning under temporally-coupled perturbations, a realistic threat that traditional attacks, which perturb each time step independently, do not capture. It introduces GRAD, a PSRO-based game-theoretic framework that treats robust RL as a two-player zero-sum game and approximates a Nash equilibrium to defend against both temporally-coupled and non-temporally-coupled attacks, using an $\bar{\epsilon}$-temporal constraint to model correlations across time. The paper formally defines temporally-coupled perturbations and shows that GRAD converges to an approximate equilibrium while improving robustness across five MuJoCo continuous-control tasks, with little loss in natural performance. The approach adapts to diverse adversaries and attack domains, though it incurs the computational cost of PSRO; future work includes improving scalability and extending to pixel-based or real-world settings.
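To make the $\bar{\epsilon}$-temporal constraint concrete, the following is a minimal sketch rather than the authors' implementation. It assumes an $\ell_\infty$ norm, a standard per-step budget `eps`, and a coupling budget `eps_bar` that bounds how far a perturbation may move from the previous step's perturbation. The function names and the choice of norm are illustrative assumptions not stated in this summary.

```python
import numpy as np

def project_temporally_coupled(proposal, prev, eps, eps_bar):
    """Clip `proposal` so it satisfies both (assumed) L-inf constraints:
    |p_t| <= eps and |p_t - p_{t-1}| <= eps_bar, coordinate-wise."""
    proposal = np.asarray(proposal, dtype=float)
    # Standard attack budget: every coordinate lies in [-eps, eps].
    lo = np.full_like(proposal, -eps)
    hi = np.full_like(proposal, eps)
    if prev is not None:
        prev = np.asarray(prev, dtype=float)
        # Temporal coupling: stay within eps_bar of the previous perturbation.
        lo = np.maximum(lo, prev - eps_bar)
        hi = np.minimum(hi, prev + eps_bar)
    # The two boxes always intersect because `prev` itself lies in both.
    return np.clip(proposal, lo, hi)

def rollout_perturbations(proposals, eps, eps_bar):
    """Apply the projection step by step over a sequence of raw proposals."""
    prev, out = None, []
    for p in proposals:
        prev = project_temporally_coupled(p, prev, eps, eps_bar)
        out.append(prev)
    return np.stack(out)
```

With `eps_bar >= 2 * eps` the coupling constraint never binds and the sketch reduces to an ordinary per-step bounded attack, which is one way to see that decoupled perturbations are a special case.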
Abstract
Deploying reinforcement learning (RL) systems requires robustness to uncertainty and model misspecification, yet prior robust RL methods typically only study noise introduced independently across time. However, practical sources of uncertainty are usually coupled across time. We formally introduce temporally-coupled perturbations, presenting a novel challenge for existing robust RL methods. To tackle this challenge, we propose GRAD, a novel game-theoretic approach that treats the temporally-coupled robust RL problem as a partially observable two-player zero-sum game. By finding an approximate equilibrium within this game, GRAD optimizes for general robustness against temporally-coupled perturbations. Experiments on continuous control tasks demonstrate that, compared with prior methods, our approach achieves a higher degree of robustness to various types of attacks on different attack domains, both in settings with temporally-coupled perturbations and decoupled perturbations.
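As a toy illustration of what finding an approximate equilibrium in a two-player zero-sum game means, the sketch below runs fictitious play on a small payoff matrix and measures exploitability. This is a deliberately simplified stand-in: GRAD itself uses PSRO with learned agent and adversary policies, not fictitious play over a fixed matrix, and the function names and the matrix-game setting are assumptions made for illustration.

```python
import numpy as np

def fictitious_play(payoff, iters=2000):
    """payoff[i, j] is the agent's return when the agent plays row i and the
    adversary plays column j. Returns the empirical mixed strategies."""
    payoff = np.asarray(payoff, dtype=float)
    n_agent, n_adv = payoff.shape
    agent_counts = np.zeros(n_agent)
    adv_counts = np.zeros(n_adv)
    agent_counts[0] = 1.0  # arbitrary initial pure strategies
    adv_counts[0] = 1.0
    for _ in range(iters):
        agent_mix = agent_counts / agent_counts.sum()
        adv_mix = adv_counts / adv_counts.sum()
        # Each player best-responds to the opponent's empirical mixture:
        # the agent maximizes return, the adversary minimizes it.
        agent_counts[np.argmax(payoff @ adv_mix)] += 1.0
        adv_counts[np.argmin(agent_mix @ payoff)] += 1.0
    return agent_counts / agent_counts.sum(), adv_counts / adv_counts.sum()

def exploitability(payoff, agent_mix, adv_mix):
    """Total gain available to both players from deviating to a best
    response. It is non-negative and zero exactly at a Nash equilibrium."""
    payoff = np.asarray(payoff, dtype=float)
    best_for_agent = np.max(payoff @ adv_mix)
    worst_for_agent = np.min(agent_mix @ payoff)
    return best_for_agent - worst_for_agent
```

On matching pennies, `fictitious_play([[1, -1], [-1, 1]])` yields mixtures close to uniform with small exploitability. A robust policy in this view is one the adversary cannot exploit much further, whichever attack it switches to.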
