ISAACS: Iterative Soft Adversarial Actor-Critic for Safety

Kai-Chieh Hsu; Duy Phuong Nguyen; Jaime Fernández Fisac

TL;DR

The paper addresses the safe operation of robots with high-dimensional nonlinear dynamics in uncontrolled environments by combining game-theoretic safety analysis with adversarial reinforcement learning. It introduces ISAACS, an offline training procedure that jointly learns a best-effort safety policy and a worst-case disturbance policy. The learned safety policy carries no guarantee on its own; instead, it serves as the fallback in a runtime safety shield that certifies each action through forward rollouts under bounded uncertainty. On a 5D race car simulator, the rollout-based shield achieves zero safety violations against worst-case disturbances while remaining reasonably non-conservative. Unlike numerical Hamilton-Jacobi-Isaacs (HJI) solutions, which scale poorly with state dimension, ISAACS offers scalable safety certification that is robust to model error and the training-to-deployment gap, giving a viable path toward safely deploying learning-based policies in open-world robotics.

Abstract

The deployment of robots in uncontrolled environments requires them to operate robustly under previously unseen scenarios, like irregular terrain and wind conditions. Unfortunately, while rigorous safety frameworks from robust optimal control theory scale poorly to high-dimensional nonlinear dynamics, control policies computed by more tractable "deep" methods lack guarantees and tend to exhibit little robustness to uncertain operating conditions. This work introduces a novel approach enabling scalable synthesis of robust safety-preserving controllers for robotic systems with general nonlinear dynamics subject to bounded modeling error by combining game-theoretic safety analysis with adversarial reinforcement learning in simulation. Following a soft actor-critic scheme, a safety-seeking fallback policy is co-trained with an adversarial "disturbance" agent that aims to invoke the worst-case realization of model error and training-to-deployment discrepancy allowed by the designer's uncertainty. While the learned control policy does not intrinsically guarantee safety, it is used to construct a real-time safety filter (or shield) with robust safety guarantees based on forward reachability rollouts. This shield can be used in conjunction with a safety-agnostic control policy, precluding any task-driven actions that could result in loss of safety. We evaluate our learning-based safety approach in a 5D race car simulator, compare the learned safety policy to the numerically obtained optimal solution, and empirically validate the robust safety guarantee of our proposed safety shield against worst-case model discrepancy.
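To make the "game-theoretic safety analysis" concrete, the following is a minimal sketch of a finite-horizon max-min safety backup on a toy one-dimensional grid. It is not the paper's algorithm (ISAACS approximates this kind of solution with actor-critic learning); the grid world, the function name `safety_value`, and the parameters `u_max` and `d_max` are all hypothetical choices made for illustration. The controller picks a move first, then the disturbance picks the worst response, and the value records the smallest safety margin seen along the way.

```python
def safety_value(n_cells: int, u_max: int, d_max: int, horizon: int) -> list[int]:
    """Worst-case safety value on a 1D grid with walls at both ends (toy example).

    A state is a cell index. Its safety margin is the distance to the nearest
    wall; a margin of 0 counts as failure. Each step the controller shifts the
    state by u in [-u_max, u_max] and the disturbance by d in [-d_max, d_max].
    """
    margin = [min(i, n_cells - 1 - i) for i in range(n_cells)]
    value = margin[:]  # horizon 0: the value is just the current margin

    def clip(i: int) -> int:
        # Keep indices on the grid; clipped states sit on a wall (margin 0).
        return max(0, min(n_cells - 1, i))

    for _ in range(horizon):
        # Backup: V_k(x) = min(g(x), max_u min_d V_{k-1}(f(x, u, d)))
        value = [
            min(
                margin[i],
                max(
                    min(value[clip(i + u + d)] for d in range(-d_max, d_max + 1))
                    for u in range(-u_max, u_max + 1)
                ),
            )
            for i in range(n_cells)
        ]
    return value


# Cells with a positive value can be kept safe for `horizon` steps
# no matter what the bounded disturbance does.
values = safety_value(n_cells=21, u_max=1, d_max=2, horizon=3)
safe_cells = [i for i, v in enumerate(values) if v > 0]
```

In this toy setting the disturbance is stronger than the control, so the safe set shrinks as the horizon grows; swapping the two bounds keeps every interior cell safe. Exhaustive backups like this are what become intractable in higher dimensions, which is the gap the learned approach is meant to fill.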

Paper Structure (18 sections, 1 theorem, 16 equations, 3 figures, 1 algorithm)

Introduction
Related Work
Preliminaries
Hamilton-Jacobi-Isaacs Reachability Analysis and Safety Filters
Reachability Analysis through Reinforcement Learning
ISAACS: Iterative Soft Adversarial Actor-Critic for Safety
Adversarial Actor-Critic Reinforcement Learning for Safety Policy Synthesis
Runtime Safety Filter through Robust Policy Rollout
Experimental Evaluation
Implementation Details
Environment
ISAACS and Baselines
Safety Filters
Evaluation
Results
...and 3 more sections

Key Result

Theorem 1 (Finite-Horizon Safety)

If the initial state $x_t$ satisfies $\Delta_{\mathcal{R}}(x_t, \pi^u_\theta, H) = 1$, the safety filter $\phi(\cdot; \Delta_{\mathcal{R}}, t)$ in eq:shield_policy keeps the feedback system safe under the disturbance set $\mathcal{D}$ for at least $H+1$ steps.
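The idea behind the theorem can be sketched on a toy system. The code below is an illustrative assumption, not the paper's implementation: a one-dimensional position `p` with dynamics `p + DT * u + d`, a hand-written saturated linear fallback standing in for the learned safety policy, and interval endpoints standing in for the forward-reachable sets. All constants and function names are hypothetical. The monitor `rollout_is_safe` plays the role of $\Delta_{\mathcal{R}}$, and `filtered_action` plays the role of the shield: a task action is accepted only if the fallback rollout from every state it could lead to stays safe for the whole horizon.

```python
DT, D_MAX, GAIN, LIMIT = 0.1, 0.03, 2.0, 1.0  # hypothetical toy constants


def safety_policy(p: float) -> float:
    """Stand-in fallback policy: saturated linear feedback toward the origin."""
    return max(-1.0, min(1.0, -GAIN * p))


def step(p: float, u: float, d: float) -> float:
    """Toy dynamics with an additive disturbance bounded by |d| <= D_MAX."""
    return p + DT * u + d


def rollout_is_safe(lo: float, hi: float, horizon: int) -> bool:
    """Monitor: propagate the reachable interval [lo, hi] under the fallback.

    Endpoint propagation is exact here only because the closed-loop map is
    monotone in p (DT * GAIN <= 1); general systems need a proper
    reachable-set over-approximation.
    """
    for _ in range(horizon + 1):  # checks steps 0..horizon
        if lo <= -LIMIT or hi >= LIMIT:  # failure set: |p| >= LIMIT
            return False
        lo = step(lo, safety_policy(lo), -D_MAX)
        hi = step(hi, safety_policy(hi), +D_MAX)
    return True


def filtered_action(p: float, u_task: float, horizon: int) -> float:
    """Shield: keep the task action only if the fallback rollout that would
    follow it is certified safe; otherwise apply the fallback now."""
    lo = step(p, u_task, -D_MAX)
    hi = step(p, u_task, +D_MAX)
    if rollout_is_safe(lo, hi, horizon):
        return u_task
    return safety_policy(p)


def simulate(p0: float, horizon: int, n_steps: int) -> list[float]:
    """Run a task policy that always pushes right against an outward-pushing
    disturbance, with the shield in the loop."""
    p, trajectory = p0, [p0]
    for _ in range(n_steps):
        u = filtered_action(p, u_task=1.0, horizon=horizon)
        d = D_MAX if p >= 0 else -D_MAX
        p = step(p, u, d)
        trajectory.append(p)
    return trajectory
```

Calling `simulate(0.005, horizon=50, n_steps=200)` lets the task policy drive toward the wall until the shield starts overriding it. In this toy the fallback is contractive, so safety holds well beyond the horizon; the theorem itself only promises $H+1$ steps from a certified state.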

Figures (3)

Figure 1: ISAACS is a game-theoretic reinforcement learning scheme whose best-effort learned safety policy can be converted into effective robust safety-certified strategies at runtime. Offline Safety Synthesis: adversarial reinforcement learning approximately solves the robust safety problem, jointly training the safety policy $\pi^u$ and worst-case disturbance $\pi^d$. Online Safety Certification: the learned safety policy is rolled out under all disturbance realizations. Here, the forward-reachable sets (orange) are safe if their footprint-augmented counterparts (green) remain collision-free. Robust Safety Filter: control actions proposed by an arbitrary task policy $\pi^{\text{task}}$ are allowed if a subsequent safety policy rollout ("fallback") is certified safe; otherwise, the (already certified) fallback tracking policy is used.
Figure 2: Left: Comparison of safety controllers' robustness to disturbances. As the disturbance bound increases, controllers trained without disturbance or with domain randomization (DR) rapidly degrade. The ISAACS controller trained against the largest adversarial disturbance suffers the least safety degradation, nearing the optimal (oracle) policy. Right: "Confusion plots" of values and rollout outcomes for 2-D slices of the state space, with $v = 1$, $\psi = 0$, $\delta = 0.03$. Top: the learned safety critic can wrongly predict some rollout outcomes, leading to inaccuracies in the estimated safe set boundary. Middle: the learned ISAACS safety policy achieves near-optimal success but is occasionally suboptimal near the safe set boundary. Bottom: direct policy rollout using the learned disturbance can lead to over-optimistic predictions.
Figure 3: Safe rate and conservativeness of different safety filters. A robust rollout-based safety filter with horizon $H = 50$ achieves zero-violation safety.

Theorems & Definitions (3)

Theorem 1: Finite-Horizon Safety
Proof
Remark 1
