Mitigating Adversarial Perturbations for Deep Reinforcement Learning via Vector Quantization

Tung M. Luu; Thanh Nguyen; Tee Joshua Tian Jin; Sungwoon Kim; Chang D. Yoo

TL;DR

This work proposes a variant of vector quantization (VQ) as a transformation for input observations. The transformation reduces the space of adversarial attacks at test time, so the transformed observations are less affected by attacks.

Abstract

Recent studies reveal that reinforcement learning (RL) agents that perform well in training often lack resilience against adversarial perturbations during deployment. This highlights the importance of building a robust agent before deploying it in the real world. Most prior work focuses on robust training-based procedures to tackle this problem, including enhancing the robustness of the deep neural network component itself or adversarially training the agent on strong attacks. In this work, we instead study an input transformation-based defense for RL. Specifically, we propose using a variant of vector quantization (VQ) as a transformation for input observations. This transformation reduces the space of adversarial attacks at test time, so the transformed observations are less affected by attacks. Our method is computationally efficient and integrates seamlessly with adversarial training, further enhancing the robustness of RL agents against adversarial attacks. Through extensive experiments in multiple environments, we demonstrate that using VQ as the input transformation effectively defends against adversarial attacks on the agent's observations.
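
The idea of shrinking the attack space with quantization can be illustrated with a minimal sketch. This is not the paper's VQ variant: it assumes a fixed, evenly spaced scalar codebook shared across all observation dimensions, and the names `quantize` and `codebook` as well as the numeric values are hypothetical. It only shows that a small perturbation can be absorbed when the clean and perturbed observations fall into the same codebook cell.

```python
import numpy as np

def quantize(obs, codebook):
    """Replace each observation dimension with its nearest codebook value."""
    obs = np.asarray(obs, dtype=np.float64)
    # Pairwise distances, shape (obs_dim, codebook_size)
    dists = np.abs(obs[..., None] - codebook)
    nearest = np.argmin(dists, axis=-1)
    return codebook[nearest]

# Assumed codebook: 9 evenly spaced values in [-1, 1] (spacing 0.25)
codebook = np.linspace(-1.0, 1.0, 9)

# Hypothetical clean observation and a small bounded perturbation of it
clean = np.array([0.30, -0.52, 0.77])
perturbed = clean + np.array([0.04, -0.04, 0.04])

q_clean = quantize(clean, codebook)
q_pert = quantize(perturbed, codebook)

# The policy would receive the quantized observation instead of the raw one
print(q_clean, q_pert)
```

In this example both observations quantize to the same vector, so a policy that consumes the quantized input would see no change. A perturbation that pushes a dimension across a cell boundary would still alter the quantized input, which is why the codebook size matters in such a scheme.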

Paper Structure (17 sections, 8 equations, 7 figures, 4 tables)

INTRODUCTION
RELATED WORK
PRELIMINARIES
Reinforcement Learning.
Test-time Adversarial Attacks.
Vector Quantization.
METHODOLOGY
Input Transformation based Defense for RL
VQ Mitigating Adversarial Perturbations
EXPERIMENTS
Evaluation in MuJoCo
Online RL
Offline RL
Evaluation in Atari
Ablation Study
...and 2 more sections

Figures (7)

Figure 1: (a) Illustration of using VQ to reduce the space of adversarial attacks. The green and red dots indicate codebook items; the red dot is the item to which the state $s$ is assigned after the VQ process. The blue dotted line indicates the boundaries. (b) Illustration of the effectiveness of VQ in countering attacks in a regression task.
Figure 2: The comparison between agents using different sizes of the codebook on Walker2d and Reacher.
Figure 3: The comparison between sharing and separate codebooks for all dimensions of states on Walker2d and Reacher.
Figure 4: Ablation on adaptive codebook learning.
Figure 5: The correlation between the input difference and the relative difference in performance.
...and 2 more figures
