Curvature-Guided Safety Filters: State-Dependent Hessian-Weighted Projection with Provable Performance Bounds
Ziyan Lin, Liang Xu
TL;DR
Safety filters for learning-enabled control often rely on Euclidean projections, which can degrade long-term performance near safety boundaries. The paper proposes a curvature-guided safety filter that uses a state-dependent Hessian-weighted metric $W(x)=-\nabla_u^2 Q(x,u_{ref})$ to bias corrections toward action directions with higher value sensitivity while preserving convexity. Under mild regularity assumptions, it bounds the value gap between the weighted projection $u_W$ and the safe value-optimal action $u^*$ by $|Q(x,u_W)-Q(x,u^*)| \le \dfrac{L_2 D^3}{3 \mu^3}$, and it gives a sufficient condition, $\dfrac{\mu^3}{L_2} \ge \dfrac{D^3}{3 c \delta^2}$, under which $u_W$ outperforms the Euclidean projection. A data-driven offline scheme trains $Q_\theta$ with curvature-promoting regularizers to obtain $W(x)$. In a quadrotor tracking-and-avoidance simulation, the filter preserves safety with less value degradation than Euclidean projection, at a real-time computational cost comparable to Euclidean projection. The approach offers a convex, curvature-aware alternative to standard Euclidean safety filters for learning-based controllers.
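To make the difference between the two projections concrete, here is a minimal sketch, not the paper's implementation. It assumes a single half-space safe set $\{u : a^\top u \le b\}$, for which the $W$-weighted projection has a closed form, and a hypothetical concave quadratic $Q$ at a fixed state whose maximizer is $u_{ref}$. The function name, the matrix `H`, and all numbers are illustrative assumptions.

```python
import numpy as np

def weighted_projection_halfspace(u_ref, W, a, b):
    """Return argmin_u (u - u_ref)^T W (u - u_ref) subject to a @ u <= b.

    W must be symmetric positive definite; W = I gives the Euclidean projection.
    """
    violation = a @ u_ref - b
    if violation <= 0.0:
        return u_ref.copy()          # u_ref is already safe
    Winv_a = np.linalg.solve(W, a)   # W^{-1} a without forming the inverse
    return u_ref - (violation / (a @ Winv_a)) * Winv_a

# Hypothetical toy problem at a fixed state x (all numbers are illustrative).
H = np.diag([10.0, 1.0])             # Q is far more curved along u[0] than u[1]
u_ref = np.array([1.0, 1.0])         # reference action, placed at the maximizer of Q

def Q(u):
    d = u - u_ref
    return -0.5 * d @ H @ d          # concave quadratic, so minus its Hessian is H

a, b = np.array([1.0, 1.0]), 1.0     # assumed safe set: u[0] + u[1] <= 1

W = H                                # W(x) = -Hessian of Q in u, evaluated at u_ref
u_W = weighted_projection_halfspace(u_ref, W, a, b)
u_E = weighted_projection_halfspace(u_ref, np.eye(2), a, b)

print("weighted :", u_W, "Q =", Q(u_W))
print("Euclidean:", u_E, "Q =", Q(u_E))
```

Both actions land on the safety boundary, but the weighted correction moves mostly along `u[1]`, where $Q$ is less curved, so `Q(u_W)` is larger than `Q(u_E)`. In this toy case $Q$ is exactly quadratic and $u_{ref}$ is its maximizer, so $u_W$ coincides with the safe value-optimal action; for a general $Q$ the two differ, which is the gap the stated bound addresses.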
Abstract
Safety filters provide a lightweight mechanism for enforcing state and input safety in learning-enabled control. However, common Euclidean projections onto the safe set disregard long-term performance, while directly optimizing the action-value function within the safe set can be nonconvex and computationally prohibitive. This paper proposes a state-dependent, Hessian-guided projection for safety filtering that preserves convexity while improving performance. The key idea is to select a weighted projection matrix from the curvature of the action-value function, thereby biasing the correction toward action directions with higher value sensitivity. We establish (i) a uniform bound on the performance gap between the weighted projection and the safe value-optimal action, and (ii) a condition under which the weighted projection outperforms the Euclidean projection in long-term value. To support black-box controllers, we further present a data-driven construction of the weighted projection matrix via an iterative Q-function learning algorithm with quadratic feature blocks and regularization that enforces curvature dominance and bounded higher-order terms. Simulations on a quadrotor tracking-and-avoidance task indicate that the proposed filter maintains safety while reducing value degradation relative to Euclidean projection, with computational overhead compatible with real-time operation.
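The data-driven construction can be illustrated with a heavily simplified sketch. It is not the paper's iterative Q-learning algorithm: plain least squares on synthetic samples stands in for the learning step, and an eigenvalue floor `mu` stands in for the curvature-dominance regularizer. The feature map, the function names, and the data below are all hypothetical, and the state is held fixed so that only the dependence on the action $u$ is modeled.

```python
import numpy as np

def quad_features(u):
    """Features [1, u_i, u_i * u_j for i <= j] of a quadratic model in u."""
    m = len(u)
    quad = [u[i] * u[j] for i in range(m) for j in range(i, m)]
    return np.concatenate(([1.0], u, quad))

def hessian_from_weights(theta, m):
    """Hessian in u of the model theta @ quad_features(u)."""
    Hq = np.zeros((m, m))
    k = 1 + m                        # skip the constant and linear weights
    for i in range(m):
        for j in range(i, m):
            if i == j:
                Hq[i, i] = 2.0 * theta[k]       # d^2/du_i^2 of theta_k * u_i^2
            else:
                Hq[i, j] = Hq[j, i] = theta[k]  # mixed partial of theta_k * u_i * u_j
            k += 1
    return Hq

def curvature_metric(theta, m, mu=1e-2):
    """W = -Hessian, with eigenvalues floored at mu so W stays positive definite."""
    W = -hessian_from_weights(theta, m)
    evals, evecs = np.linalg.eigh(W)
    return (evecs * np.maximum(evals, mu)) @ evecs.T

# Synthetic, noiseless samples of a hypothetical concave Q at one fixed state.
rng = np.random.default_rng(0)
H_true = np.array([[4.0, 1.0], [1.0, 2.0]])
g = np.array([0.5, -0.3])
U = rng.uniform(-1.0, 1.0, size=(50, 2))
q = np.array([-0.5 * u @ H_true @ u + g @ u for u in U])

Phi = np.array([quad_features(u) for u in U])
theta, *_ = np.linalg.lstsq(Phi, q, rcond=None)
W = curvature_metric(theta, m=2)
print(W)
```

Because the synthetic $Q$ is exactly quadratic and noiseless, the recovered `W` matches `H_true` up to numerical error. With a learned $Q_\theta$ the floor `mu` matters more: it keeps the weighted projection a strictly convex problem even where the estimated curvature is weak or has the wrong sign.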
