Curvature-Guided Safety Filters: State-Dependent Hessian-Weighted Projection with Provable Performance Bounds

Ziyan Lin; Liang Xu

TL;DR

Safety filters for learning-enabled control often rely on Euclidean projections, which can degrade long-term performance near safety boundaries. The paper proposes a curvature-guided safety filter that uses a state-dependent Hessian-weighted metric $W(x)=-\nabla_u^2 Q(x,u_{\mathrm{ref}})$ to bias corrections toward action directions with higher value sensitivity while preserving convexity. Under mild regularity assumptions, it derives the value-gap bound $|Q(x,u_W)-Q(x,u^*)| \le \dfrac{L_2 D^3}{3 \mu^3}$ and gives a sufficient condition, $(\mu^3/L_2) \ge (D^3)/(3 c \delta^2)$, under which $u_W$ outperforms the Euclidean projection. A data-driven offline scheme trains $Q_\theta$ with curvature-promoting regularizers to obtain $W(x)$. In a quadrotor tracking-and-avoidance simulation, the filter preserves safety with less value degradation than Euclidean projection, at a computational cost comparable to Euclidean projection and compatible with real-time operation. The approach offers a convex, curvature-aware alternative to standard safety filters for learning-based controllers in dynamic environments.

Abstract

Safety filters provide a lightweight mechanism for enforcing state and input safety in learning-enabled control. However, common Euclidean projections onto the safe set disregard long-term performance, while directly optimizing the action-value function within the safe set can be nonconvex and computationally prohibitive. This paper proposes a state-dependent, Hessian-guided projection for safety filtering that preserves convexity while improving performance. The key idea is to select a weighted projection matrix from the curvature of the action-value function, thereby biasing the correction toward action directions with higher value sensitivity. We establish (i) a uniform bound on the performance gap between the weighted projection and the safe value-optimal action, and (ii) a condition under which the weighted projection outperforms the Euclidean projection in long-term value. To support black-box controllers, we further present a data-driven construction of the weighted projection matrix via an iterative Q-function learning algorithm with quadratic feature blocks and regularization that enforces curvature dominance and bounded higher-order terms. Simulations on a quadrotor tracking-and-avoidance task indicate that the proposed filter maintains safety while reducing value degradation relative to Euclidean projection, with computational overhead compatible with real-time operation.
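To make the weighted projection concrete, the following is a minimal sketch, not the paper's implementation. It assumes a single linear safety constraint $a^\top u \le b$ (the paper's safe set may be more general), for which the projection in the $W$-metric has a closed form, and it scores actions with a hypothetical exactly quadratic $Q$. The names `weighted_projection`, `quadratic_q`, and all numerical values are illustrative assumptions.

```python
import numpy as np

def weighted_projection(u_ref, W, a, b):
    """Project u_ref onto {u : a @ u <= b} in the metric induced by W.

    Solves min_u (u - u_ref)^T W (u - u_ref) s.t. a @ u <= b, assuming W is
    symmetric positive definite. W = I gives the Euclidean projection.
    """
    violation = a @ u_ref - b
    if violation <= 0.0:
        return u_ref.copy()  # reference action is already safe
    Winv_a = np.linalg.solve(W, a)  # W^{-1} a without forming the inverse
    return u_ref - (violation / (a @ Winv_a)) * Winv_a

def quadratic_q(u, u_ref, W):
    """Hypothetical quadratic Q, up to a constant: -1/2 (u-u_ref)^T W (u-u_ref)."""
    d = u - u_ref
    return -0.5 * d @ W @ d

# Illustrative values (assumptions): curvature is 10x larger along the first axis.
u_ref = np.array([1.0, 1.0])
W = np.diag([10.0, 1.0])
a = np.array([1.0, 1.0])
b = 1.0

u_w = weighted_projection(u_ref, W, a, b)          # Hessian-weighted correction
u_e = weighted_projection(u_ref, np.eye(2), a, b)  # Euclidean correction
q_w = quadratic_q(u_w, u_ref, W)
q_e = quadratic_q(u_e, u_ref, W)
print("weighted:", u_w, "Q =", q_w)
print("euclidean:", u_e, "Q =", q_e)
```

Both corrected actions lie on the constraint boundary, but in this toy setting the weighted correction changes the action less along the high-curvature axis and therefore loses less value than the Euclidean one. For general convex safe sets the same problem would typically be passed to a QP solver rather than solved in closed form.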

Paper Structure (17 sections, 6 theorems, 45 equations, 3 figures)


INTRODUCTION
BACKGROUND
Reinforcement Learning
Safety Filter
Safe Q-learning via Projection
State-dependent Projection Matrix
Motivation: Beyond Euclidean Projection
Hessian-Weighted Projection Filter
Safety Guarantee: Forward Invariance
Performance Rationale: Single-Step Correction with Long-Term Value Sensitivity
Exact Equivalence Under Quadratic $Q$
Second-Order Analysis Framework
Near-Optimality Bound of the Weighted Projection
When Does Hessian-Weighted Projection Outperform Euclidean Projection
Data-Driven Construction of Weighting Matrices
...and 2 more sections

Key Result

Lemma 1

Fix a state $x$ and suppose that the action-value function admits an exact, strictly concave quadratic form with $W\succ 0$, so that $u_{\mathrm{ref}}$ is the unconstrained Bellman-optimal action at $x$. Then the solution of the safe Bellman optimization problem coincides with the output of the Hessian-weighted safety filter. Hence, in this quadratic setting, the proposed single-step safety filter…
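The displayed quadratic form did not survive in the statement above. As a sketch, assume it is the standard second-order form centered at $u_{\mathrm{ref}}$ with $W=-\nabla_u^2 Q(x,u_{\mathrm{ref}})$, and write $\mathcal{U}_{\mathrm{safe}}(x)$ for the safe action set (a symbol introduced here for illustration only). The equivalence then follows directly:

```latex
\begin{aligned}
Q(x,u) &= Q(x,u_{\mathrm{ref}})
  - \tfrac{1}{2}\,(u-u_{\mathrm{ref}})^\top W\,(u-u_{\mathrm{ref}}), \\
\arg\max_{u \in \mathcal{U}_{\mathrm{safe}}(x)} Q(x,u)
  &= \arg\min_{u \in \mathcal{U}_{\mathrm{safe}}(x)}
  (u-u_{\mathrm{ref}})^\top W\,(u-u_{\mathrm{ref}}).
\end{aligned}
```

The constant $Q(x,u_{\mathrm{ref}})$ and the factor $\tfrac{1}{2}$ do not affect the maximizer, so maximizing $Q$ over the safe set is the same problem as projecting $u_{\mathrm{ref}}$ onto that set in the $W$-weighted norm. If the safe set is closed and convex, $W\succ 0$ makes this minimizer unique.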

Figures (3)

Figure 1: Trajectory comparison between the Euclidean projection and the proposed Hessian-weighted projection. The dashed curve denotes the reference path, and the shaded circles indicate the obstacle and its safety margin.
Figure 2: Instantaneous difference in action-value function between the Hessian-weighted and Euclidean methods.
Figure 3: Per-step control solve time for the Euclidean, Hessian-weighted, and Q-value methods.

Theorems & Definitions (12)

Lemma 1: Exact Long-Term Optimality under Quadratic $Q$
proof
Theorem 1
proof
Theorem 2
proof
Lemma 2
proof
Theorem 3
proof
...and 2 more
