Adversarially Robust Decision Transformer

Xiaohang Tang, Afonso Marques, Parameswaran Kamalaruban, Ilija Bogunovic

TL;DR

This work proposes a worst-case-aware RvS algorithm, the Adversarially Robust Decision Transformer (ARDT), which learns in-sample minimax returns-to-go and conditions the policy on them. ARDT is significantly more robust to powerful test-time adversaries and attains higher worst-case returns than contemporary DT methods.

Abstract

Decision Transformer (DT), one of the representative Reinforcement Learning via Supervised Learning (RvS) methods, has achieved strong performance in offline learning tasks by leveraging the powerful Transformer architecture for sequential decision-making. However, in adversarial environments, these methods can be non-robust, since the return depends on the strategies of both the decision-maker and the adversary. Training a probabilistic model conditioned on the observed return to predict actions can fail to generalize, as the trajectories that achieve a given return in the dataset might have done so only because the behavior adversary was suboptimal. To address this, we propose a worst-case-aware RvS algorithm, the Adversarially Robust Decision Transformer (ARDT), which learns and conditions the policy on in-sample minimax returns-to-go. ARDT aligns the target return with the worst-case return learned through minimax expectile regression, thereby enhancing robustness against powerful test-time adversaries. In experiments conducted on sequential games with full data coverage, ARDT can generate a maximin (Nash Equilibrium) strategy, the solution with the largest adversarial robustness. In large-scale sequential games and continuous adversarial RL environments with partial data coverage, ARDT demonstrates significantly superior robustness to powerful test-time adversaries and attains higher worst-case returns compared to contemporary DT methods.
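
The abstract relies on expectile regression to approximate an in-sample minimum or maximum. The sketch below is an illustration of that general idea only, not the paper's training objective: it uses the standard asymmetric squared loss $|\alpha - \mathbb{1}[u < 0]|\,u^2$ and shows that the resulting expectile of a set of returns moves toward the smallest sample as $\alpha \to 0$ and toward the largest as $\alpha \to 1$. The function names, the sample returns, and the reweighted-averaging solver are all assumptions chosen for this example.

```python
def expectile_loss(residual, alpha):
    """Asymmetric squared loss: weight alpha on non-negative residuals, 1 - alpha otherwise."""
    weight = alpha if residual >= 0 else 1.0 - alpha
    return weight * residual ** 2


def expectile(samples, alpha, iters=200):
    """Minimiser of the mean expectile loss over a scalar, via reweighted averaging.

    At the optimum m, m equals the weighted mean of the samples, where samples
    above m get weight alpha and samples below m get weight 1 - alpha.
    """
    m = sum(samples) / len(samples)  # start from the ordinary mean
    for _ in range(iters):
        weights = [alpha if x >= m else 1.0 - alpha for x in samples]
        m = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return m


# Hypothetical returns observed for one state-action pair under different opponents.
returns = [0.0, 1.0, 5.0, 6.0]

low = expectile(returns, 0.01)   # approaches min(returns) as alpha -> 0
mid = expectile(returns, 0.5)    # equals the mean when alpha = 0.5
high = expectile(returns, 0.99)  # approaches max(returns) as alpha -> 1
```

Because the estimate is always a weighted average of returns that actually occur in the data, it stays within the range of observed outcomes, which is the sense in which such an estimate is "in-sample".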

Paper Structure (24 sections, 2 theorems, 12 equations, 14 figures, 7 tables, 1 algorithm)


Key Result

Theorem 1

Let $\pi_\mathcal{D}$ and $\bar{\pi}_\mathcal{D}$ be the data collecting policies used to gather data for training an RvS protagonist policy $\pi$. Assume $T(s_{t+1} \mid s_t, a_t, {\bar{a}_t}) = \rho^{\pi_\mathcal{D}, \bar{\pi}_\mathcal{D}}(s_{t+1} \mid \tau_{0:t-1}, s_t, a_t, {\bar{a}_t}, I(\tau)

Figures (14)

  • Figure 1: LHS presents the game where decision-maker $P$ is confronted by adversary $A$. In the worst-case scenario, if $P$ chooses action $a_0$, $A$ will respond with $\bar{a}_0$, and if $P$ chooses $a_1$, $A$ will counter with $\bar{a}_4$. Consequently, the worst-case returns for actions $a_0$ and $a_1$ are $0$ and $1$, respectively. Therefore, the robust choice of action for the decision-maker is $a_1$. RHS displays tables of action probabilities and the worst-case returns for the Decision Transformer (DT), Expected Return-Conditioned DT (ERC-DT) methods and our algorithm, when conditioned on the largest return-to-go $6$. After training using uniformly collected data that covered all possible trajectories, the results reveal that DT fails to select the robust action $a_1$, whereas our algorithm manages to do so.
  • Figure 2: Training of the Adversarially Robust Decision Transformer. We adopt expectile regression for the estimator $\widetilde{Q}$ to approximate the in-sample minimax. In the subsequent protagonist DT training, we replace the original returns-to-go with the learned values $\widetilde{Q}^*$ to train the policy.
  • Figure 3: Worst-case return versus target return plot comparing the proposed ARDT algorithm against vanilla DT, on our Single-stage Game (left), Gambling (centre) and our Multi-stage Game (right), over $10$ seeds.
  • Figure 4: Average return of ARDT and vanilla DT on Connect Four when trained on suboptimal datasets collected with different levels of optimality for both the online protagonist's policy ($30\%$, $40\%$ and $50\%$ optimal) and the adversary's policy ($10\%$, $30\%$, $50\%$ optimal), over $10$ seeds. We test against a fixed adversary that acts optimally $50\%$ of the time, and randomly otherwise.
  • Figure 4: Data profile of MuJoCo NR-MDP.
  • ...and 9 more figures
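
The worst-case reasoning in the Figure 1 caption can be sketched in a few lines. Only three numbers come from the caption: the worst-case return of $a_0$ is $0$ (via $\bar{a}_0$), the worst-case return of $a_1$ is $1$ (via $\bar{a}_4$), and the largest return-to-go is $6$. Every other payoff, the assignment of adversary actions to branches, and the placement of the return $6$ under $a_0$ are hypothetical values chosen for illustration.

```python
# Hypothetical single-stage game: protagonist action -> adversary response -> return.
# Only the entries 0 (a0, abar0), 1 (a1, abar4) and the maximum 6 follow the caption.
payoffs = {
    "a0": {"abar0": 0, "abar1": 6, "abar2": 5},
    "a1": {"abar3": 4, "abar4": 1},
}


def worst_case_returns(payoffs):
    """Return each protagonist action's value against the most harmful response."""
    return {action: min(resp.values()) for action, resp in payoffs.items()}


def maximin_action(payoffs):
    """Protagonist action with the largest worst-case return."""
    worst = worst_case_returns(payoffs)
    return max(worst, key=worst.get)


def actions_reaching(payoffs, target):
    """Protagonist actions that appear in at least one trajectory with return `target`."""
    return [action for action, resp in payoffs.items() if target in resp.values()]


worst = worst_case_returns(payoffs)         # {'a0': 0, 'a1': 1}
robust = maximin_action(payoffs)            # 'a1'
seen_with_6 = actions_reaching(payoffs, 6)  # ['a0']
```

Under these assumed payoffs, the only trajectories with return $6$ start with $a_0$, so a model that imitates the actions seen alongside the observed return $6$ is drawn toward $a_0$, even though $a_0$ has the lower worst-case return. Relabelling each action with its worst-case value instead ranks $a_1$ first, which matches the robust choice described in the caption.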

Theorems & Definitions (3)

  • Theorem 1
  • Theorem
  • proof