Training Verifiably Robust Agents Using Set-Based Reinforcement Learning

Manuel Wendl; Lukas Koller; Tobias Ladner; Matthias Althoff

TL;DR

This work integrates set-based reachability analysis into reinforcement learning to train verifiably robust agents for continuous control. By propagating sets of perturbed inputs through the actor and critic and optimizing a set-based loss, the method yields policies that maximize the worst-case reward under $\ell_\infty$-bounded input perturbations and can be formally verified via reachability analysis. The authors derive explicit set-based losses and gradients, compare the two set-based policy-gradient variants SA-SC and SA-PC against standard and adversarial baselines, and demonstrate robustness improvements on four benchmarks. The approach enables safer deployment of neural controllers in safety-critical settings and offers a pathway to rigorous verification in learning-based control systems.

Abstract

Reinforcement learning often uses neural networks to solve complex control tasks. However, neural networks are sensitive to input perturbations, which makes their deployment in safety-critical environments challenging. This work lifts recent results on formally verifying neural networks against such perturbations to reinforcement learning in continuous state and action spaces using reachability analysis. While previous work on robust reinforcement learning mainly focuses on adversarial attacks, we train neural networks using entire sets of perturbed inputs and maximize the worst-case reward. The obtained agents are verifiably more robust than agents obtained by related work, making them more applicable in safety-critical environments. This is demonstrated with an extensive empirical evaluation on four different benchmarks.
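To make the difference between attacking with individual perturbed inputs and reasoning over the entire perturbation set concrete, here is a minimal sketch. It assumes a linear value function as a stand-in for the critic, which is not the paper's neural-network setting; the function names and numbers are hypothetical. For a linear function, the minimum over an $\ell_\infty$ ball of radius $\epsilon$ has the closed form $w^\top s + b - \epsilon \lVert w \rVert_1$, whereas random sampling can only approach that minimum from above.

```python
import random
from typing import List


def linear_value(w: List[float], b: float, s: List[float]) -> float:
    """Hypothetical linear stand-in for a value function: w . s + b."""
    return sum(wi * si for wi, si in zip(w, s)) + b


def worst_case_value(w: List[float], b: float, s: List[float], eps: float) -> float:
    """Exact minimum of w . s' + b over all s' with ||s' - s||_inf <= eps."""
    return linear_value(w, b, s) - eps * sum(abs(wi) for wi in w)


def sampled_worst_case(w: List[float], b: float, s: List[float], eps: float,
                       n_samples: int, seed: int = 0) -> float:
    """Smallest value found among randomly sampled perturbations (no guarantee)."""
    rng = random.Random(seed)
    return min(
        linear_value(w, b, [si + rng.uniform(-eps, eps) for si in s])
        for _ in range(n_samples)
    )


# Illustrative numbers only; they do not come from the paper.
w, b = [1.5, -2.0, 0.5], 0.3
s, eps = [0.2, -0.1, 0.4], 0.05

exact = worst_case_value(w, b, s, eps)
sampled = sampled_worst_case(w, b, s, eps, n_samples=100)
print(f"set-based worst case: {exact:.4f}, sampled worst case: {sampled:.4f}")
```

The sampled value never falls below the set-based one, so an agent evaluated only on samples can look more robust than it is.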

Paper Structure (26 sections, 3 theorems, 65 equations, 7 figures, 2 tables, 1 algorithm)

Introduction
Preliminaries
Notation
Neural Networks
Deep Deterministic Policy Gradient
Set-Based Computations
Set Propagation through Neural Networks
Set-Based Training of Neural Networks
Problem Statement
Set-Based Reinforcement Learning
Derivation of Set-Based Loss Functions
Set-Based Regression Loss
Set-Based Policy Gradient
Derivation of SA-PC
Expectation Preserving Image Enclosure
...and 11 more sections

Key Result

Proposition 1

Given an input set $\mathcal{X}$, the output set of a neural network can be enclosed as:
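As a rough illustration of what enclosing a network's output set means in practice, the following sketch propagates an axis-aligned box through a small ReLU network using interval bounds. This is the simpler interval technique, not the zonotope-based enclosure that the proposition refers to; the network weights and the helper names `affine_bounds`, `enclose_output`, and `forward` are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Layer = Tuple[List[List[float]], List[float]]  # (weight matrix W, bias b)


def affine_bounds(W: List[List[float]], b: Vector,
                  lo: Vector, hi: Vector) -> Tuple[Vector, Vector]:
    """Bound W x + b elementwise for every x with lo <= x <= hi."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        low = high = bias
        for w, l, h in zip(row, lo, hi):
            if w >= 0:  # positive weight keeps the bound order
                low += w * l
                high += w * h
            else:       # negative weight swaps lower and upper bound
                low += w * h
                high += w * l
        out_lo.append(low)
        out_hi.append(high)
    return out_lo, out_hi


def enclose_output(layers: List[Layer], lo: Vector, hi: Vector) -> Tuple[Vector, Vector]:
    """Interval enclosure of the output; ReLU follows every layer but the last."""
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_bounds(W, b, lo, hi)
        if i < len(layers) - 1:
            # ReLU is monotone, so it can be applied to each bound directly.
            lo = [max(0.0, v) for v in lo]
            hi = [max(0.0, v) for v in hi]
    return lo, hi


def forward(layers: List[Layer], x: Vector) -> Vector:
    """Ordinary point-wise forward pass through the same network."""
    for i, (W, b) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + bias for row, bias in zip(W, b)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


# Hypothetical 2-2-1 network and an l_inf ball of radius eps around `center`.
layers: List[Layer] = [
    ([[1.0, -1.0], [0.5, 2.0]], [0.0, -1.0]),
    ([[1.0, 1.0]], [0.1]),
]
center, eps = [0.2, 0.4], 0.1
lo, hi = enclose_output(layers, [c - eps for c in center], [c + eps for c in center])
print("output enclosed in", list(zip(lo, hi)))
```

Every point-wise output `forward(layers, x)` for an `x` inside the box lies within the printed bounds. Interval bounds are generally looser than zonotope enclosures because they discard dependencies between neurons.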

Figures (7)

Figure 1: Comparison of standard and our novel set-based reinforcement learning on a navigation task. Left: Some trajectories of the standard agent intersect with the obstacle. Right: We can formally verify the safety of our robust agent.
Figure 2: Illustration of the structure of the deep deterministic policy gradient algorithm; ➀ and ➁ show the components that are augmented through our set-based training (see Set-Based Reinforcement Learning).
Figure 3: Probability density function of a zonotope propagated through a neural network with $\operatorname{ReLU}$ activations: exact density function obtained via sampling (blue), interval enclosure (yellow), and the density of sets obtained using set-based forward propagation with uniformly distributed zonotope factors $\beta_j \sim \mathscr{U}(-1,1)$ (green).
Figure 4: Comparison of $\underline{V}_\mu(s_0)$ for the (a) 1D Quadrocopter, (c) Navigation Task, and (d) Inverted Pendulum benchmarks; (b) shows the comparison for the TD3 implementation on the 1D Quadrocopter.
Figure 5: Quad. 1D: Comparison of the reachable altitudes $z$ and vertical speeds $\dot z$ for $\epsilon_\text{test}=0.15$.
...and 2 more figures

Theorems & Definitions (11)

Definition 1: Neural Network [bishop2006pattern]
Definition 2: Zonotope [girard2005reachability]
Proposition 1: Neural Network Set Propagation [NEURIPS2018_f2f44698]
Proposition 2: Set-Based Regression Loss
proof
Definition 3: Set-Based Policy Gradient SA-SC
Definition 4: Set-Based Policy Gradient SA-PC
Proposition 3: Tight Expectation-Preserving Set Propagation
proof
proof
...and 1 more
