Patching Approximately Safe Value Functions Leveraging Local Hamilton-Jacobi Reachability Analysis

Sander Tonkens; Alex Toofanian; Zhizhen Qin; Sicun Gao; Sylvia Herbert

TL;DR

The paper tackles the challenge of obtaining formally safe value functions when starting from approximately safe ones by introducing HJ-Patch, a local DP-based patching method guided by Hamilton-Jacobi reachability. By updating only states near the safety boundary, HJ-Patch yields a safe value function $h^*(x)$ whose 0-superlevel set is the viability kernel of the initial safe set, with substantial computational savings compared to global HJ reachability. Empirical results across adaptive cruise control and quadcopter experiments demonstrate that HJ-Patch markedly reduces unsafe trajectories relative to learned barriers while achieving up to 2-order-of-magnitude speedups, thus enabling scalable, formally safer integration of learning-based components. The work provides both theoretical guarantees (under discretization) and practical guidelines for applying patching in higher-dimensional systems, highlighting its role as a bridge between data-driven safety methods and formal reachability analysis.

Abstract

Safe value functions, such as control barrier functions, characterize a safe set and synthesize a safety filter, overriding unsafe actions, for a dynamic system. While function approximators like neural networks can synthesize approximately safe value functions, they typically lack formal guarantees. In this paper, we propose a local dynamic programming-based approach to "patch" approximately safe value functions to obtain a safe value function. This algorithm, HJ-Patch, produces a novel value function that provides formal safety guarantees, yet retains the global structure of the initial value function. HJ-Patch modifies an approximately safe value function at states that are both (i) near the safety boundary and (ii) may violate safety. We iteratively update both this set of "active" states and the value function until convergence. This approach bridges the gap between value function approximation methods and formal safety through Hamilton-Jacobi (HJ) reachability, offering a framework for integrating various safety methods. We provide simulation results on analytic and learned examples, demonstrating HJ-Patch reduces the computational complexity by 2 orders of magnitude with respect to standard HJ reachability. Additionally, we demonstrate the perils of using approximately safe value functions directly and showcase improved safety using HJ-Patch.
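
The active-set iteration described in the abstract can be sketched on a toy problem. The block below is a minimal illustration, not the authors' implementation: it swaps the continuous-time HJ update for a discrete-time value iteration on a 1D grid, and the dynamics $\dot{x} = x + u$ with $|u| \le 1$, the initial function $h^0(x) = 1.5 - |x|$, the neighbor rule for the active set, and all names (`patch_value_function`, `eps`, `controls`) are assumptions made for the example.

```python
import numpy as np

def patch_value_function(h0, xs, dt=0.05, controls=(-1.0, 0.0, 1.0),
                         eps=1e-6, max_iters=10_000):
    """Locally patch h0 on grid xs for the toy system x_dot = x + u (assumed)."""
    h = h0.copy()
    n = len(xs)
    # Initial active set: grid points adjacent to a sign change of h0,
    # i.e. the boundary of its 0-superlevel set.
    nonneg = h0 >= 0
    flips = np.flatnonzero(nonneg[:-1] != nonneg[1:])
    active = set(flips.tolist()) | set((flips + 1).tolist())
    for it in range(max_iters):
        if not active:
            return h, it  # converged: no state left to update
        idx = np.array(sorted(active), dtype=int)
        x = xs[idx]
        # Best achievable next-step value over the control set
        # (linear interpolation; np.interp clamps outside the grid).
        best = np.full(len(idx), -np.inf)
        for u in controls:
            x_next = x + dt * (x + u)
            best = np.maximum(best, np.interp(x_next, xs, h))
        # Avoid-type update: never exceed the initial value function.
        h_new = np.minimum(h0[idx], best)
        changed = idx[np.abs(h_new - h[idx]) > eps]
        h[idx] = h_new
        # Next active set: states that changed, plus their grid neighbors.
        active = set()
        for i in changed.tolist():
            active.update(j for j in (i - 1, i, i + 1) if 0 <= j < n)
    return h, max_iters

xs = np.linspace(-2.5, 2.5, 101)
h0 = 1.5 - np.abs(xs)            # optimistic: claims |x| <= 1.5 is safe
h_patched, iters = patch_value_function(h0, xs)
safe = xs[h_patched >= 0]
print(f"patched safe interval: [{safe.min():.2f}, {safe.max():.2f}] after {iters} iterations")
```

For this toy system any state with $|x| > 1$ drifts outward regardless of the control, so the patched 0-superlevel set should shrink from roughly $[-1.5, 1.5]$ to roughly $[-1, 1]$, while values well inside the safe set are never touched.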

Paper Structure (13 sections, 4 theorems, 13 equations, 4 figures, 2 tables, 1 algorithm)

Introduction
Preliminaries
Control Barrier Functions
Hamilton-Jacobi reachability
Patching approximately safe value functions
Algorithm overview
Theoretical guarantees
Discussion of theoretical results
Experiments
Expert barrier for adaptive cruise control
Learned barrier for vertical quadcopter (4-dimensional)
Learned barrier for planar quadcopter (6-dimensional)
Conclusions

Key Result

Lemma 1

If Alg. 1 (HJ-Patch) converges, then the 0-superlevel set of the value function ${h}^*(x)$, ${\mathcal{H}}^*$, obtained upon termination of the algorithm is quasi-control invariant.
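
For reference, the two sets this result relates can be written in their standard textbook form. The notation is an assumption and may differ from the paper's: dynamics $\dot{x} = f(x, u)$ with admissible control signals $u(\cdot)$, and initial safe set $\mathcal{H}^0 = \{x : h^0(x) \ge 0\}$.

$$\mathcal{H}^* = \{x : h^*(x) \ge 0\}, \qquad \mathcal{S}(\mathcal{H}^0) = \{x_0 \in \mathcal{H}^0 : \exists\, u(\cdot) \text{ such that } x(t) \in \mathcal{H}^0 \ \ \forall t \ge 0\}.$$

Here $\mathcal{S}(\mathcal{H}^0)$ is the viability kernel of the initial safe set, which the summary above identifies with $\mathcal{H}^*$ at convergence.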

Figures (4)

Figure 1: The process of HJ-Patch (Alg. 1) is shown in the top row from left to right. The 0-level set of an approximately safe value function $h^0(x)$ is plotted in the top left panel (green line), where the states such that $h^0(x)<0$ (gray) comprise the initial set of unsafe states. This function incorrectly classifies an unsafe region of the state space (white) as safe. The initial active set for local dynamic programming ${Q^{(0)}}$ is shown in orange, and represents the potentially unsafe boundary states. For a given iteration $k$ (below, zoomed), we compute the updated value function and plot its 0-level set (center panel) for the states in the active set. Next, we update the active set (right) for the next iteration. Upon convergence, the 0-superlevel set of ${h}^*(x)$ (shaded green) is the viability kernel of the initial value function $h^0(x)$.
Figure 2: HJ-Patch iteratively updates an approximately safe value function for the adaptive cruise control example. Small errors in the initial value function result in incorrectly classifying an unsafe region of the state space (white region, right) as safe. These are efficiently "patched" (left to right), resulting in a safe value function at convergence (right).
Figure 3: 0-superlevel set (left) and value function (right) on different 2D projections: 0 velocities (top) and large negative velocities (bottom). The patched value function is in blue, the standard HJR value function is in green, and the set associated with the original neural value function is in black. The patched CBVF provides a tight approximation of the global solution within and near the boundary, but has a flattened value function, hence small gradients, outside the safe set in regions that required extensive updating.
Figure 4: $1000$ sampled trajectories over $10$ s. The learned policy acts as the nominal controller, regulated with a CBF-based safety filter using the neural value function (left), the HJ-Patch value function (center), and the warm-started standard HJ value function (right). The large number of unsafe trajectories (red; green is safe) for the neural value function highlights the need for patching neural barrier functions. Similar behavior is observed for other learned value functions under an adversarial nominal policy. As detailed in the "Theoretical guarantees" section, invariance is only strictly guaranteed at the discretized states, which results in a non-zero number of unsafe trajectories for both HJ-Patch and the global HJ value function.
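
The CBF-based safety filter mentioned in the Figure 4 caption can be illustrated with the standard closed-form projection for a single barrier constraint. This is a minimal sketch under stated assumptions, not the filter used in the paper: it assumes control-affine dynamics $\dot{x} = f(x) + g(x)u$, ignores input bounds, and uses the hypothetical names `cbf_filter` and `gamma`; the toy system $\dot{x} = x + u$ with $h(x) = 1 - x^2$ is chosen only for the example.

```python
import numpy as np

def cbf_filter(u_nom, h, grad_h, f, g, gamma=1.0):
    """Minimally modify u_nom so that grad_h @ (f + g @ u) + gamma * h >= 0.

    Closed-form solution of min ||u - u_nom||^2 subject to that single
    affine constraint (no input bounds, an assumption of this sketch).
    """
    a = grad_h @ g                  # L_g h, shape (m,)
    b = grad_h @ f + gamma * h      # L_f h + gamma * h, scalar
    slack = a @ u_nom + b
    if slack >= 0 or not np.any(a):
        return u_nom                # nominal input already satisfies the constraint
    # Project u_nom onto the constraint boundary a @ u + b = 0.
    return u_nom - slack * a / (a @ a)

# Toy example (assumed): x_dot = x + u, barrier h(x) = 1 - x^2, state x = 0.9.
x = 0.9
h_val = 1.0 - x**2
grad_h = np.array([-2.0 * x])
f = np.array([x])
g = np.array([[1.0]])

u_unsafe = np.array([1.0])          # pushes toward the boundary
u_safe = np.array([-2.0])           # already moves inward
u_filtered = cbf_filter(u_unsafe, h_val, grad_h, f, g)
u_passthrough = cbf_filter(u_safe, h_val, grad_h, f, g)
print(u_filtered, u_passthrough)
```

The filter only intervenes when the nominal input violates the barrier condition, which is why its behavior depends entirely on the quality of the value function $h$ it is given.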

Theorems & Definitions (14)

Definition 1: Control invariant set
Definition 2: Viability kernel [AubinBayenEtAl2011]
Definition 3: Control Barrier Function [AmesGrizzleEtAl2014]
Definition 4: Quasi-control invariant set $\mathcal{H}$
Lemma 1: Algorithm 1 converges to a control-invariant set
proof
Remark 1
Lemma 2: Optimistic global warm-start HJ reachability recovers the viability kernel [HerbertBansalEtAl2019]
Lemma 3
proof
...and 4 more
