Safe Value Functions
Pierre-François Massiani, Steve Heim, Friedrich Solowjow, Sebastian Trimpe
TL;DR
This work defines Safe Value Functions (SVFs): value functions that are optimal for a given task and whose optimal behavior keeps the system within the viability kernel ${\mathcal{X}_V}$. It proves that there exists a finite penalty $p^\star$ on failure such that, for all $p>p^\star$, the penalized value function $V_p$ is safe and remains optimal on ${\mathcal{X}_V}$; increasing the penalty beyond $p^\star$ does not degrade optimality. The authors derive a zeroth-order safety condition and provide explicit formulas for $p^\star$ in both continuous and discrete time, showing how the discounting $\tau$, the time-to-failure $T_f$, reward shaping, and the dynamics interact. They connect SVFs to Hamilton–Jacobi reachability and control barrier functions, discuss implications for CMDP duality, and offer practical reward-design guidelines for obtaining safe, task-relevant behavior in reinforcement learning and control.
Abstract
Safety constraints and optimality are important but sometimes conflicting criteria for controllers. Although these criteria are often addressed separately, with different tools, to maintain formal guarantees, it is also common practice in reinforcement learning to simply modify the reward function by penalizing failures, with the penalty treated as a mere heuristic. We rigorously examine how both safety and optimality relate to penalties, and formalize sufficient conditions for safe value functions (SVFs): value functions that are both optimal for a given task and enforce safety constraints. We reveal this structure by examining when rewards preserve viability under optimal control, and show that there always exists a finite penalty that induces a safe value function. This penalty is not unique and is unbounded above: larger penalties do not harm optimality. Although it is often not possible to compute the minimum required penalty, we reveal a clear structure in how the penalty, rewards, discount factor, and dynamics interact. This insight suggests practical, theory-guided heuristics for designing reward functions for control problems where safety is important.
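The existence of a finite penalty threshold can be illustrated on a toy problem. The sketch below is not taken from the paper: the three-state deterministic MDP, its rewards, the discount factor `GAMMA`, and the helper names `step`, `value_iteration`, `greedy_action`, and `is_safe` are all hypothetical choices made for illustration. State 0 is viable, state 1 is unviable (failure follows on the next step regardless of the action), and state 2 is the absorbing failure state. Leaving state 0 earns a reward of 1, and entering failure costs `penalty`. For this particular example, comparing the two action values in state 0 gives a threshold of `1 / GAMMA`: below it the optimal policy leaves the viability kernel, above it the optimal policy stays inside.

```python
GAMMA = 0.9  # discount factor (assumed value for this toy example)

# Hypothetical toy MDP, not from the paper:
#   state 0: viable, state 1: unviable, state 2: failure (absorbing)
#   action 0: "stay", action 1: "go"
STATES = (0, 1, 2)
ACTIONS = (0, 1)


def step(state, action, penalty):
    """Deterministic transition: returns (next_state, reward)."""
    if state == 2:                 # failure is absorbing, no further reward
        return 2, 0.0
    if state == 1:                 # unviable: every action leads to failure
        return 2, -penalty
    if action == 0:                # state 0, "stay": remain viable, no reward
        return 0, 0.0
    return 1, 1.0                  # state 0, "go": tempting reward, leaves kernel


def value_iteration(penalty, gamma=GAMMA, iters=1000):
    """Synchronous value iteration on the penalized problem."""
    V = [0.0 for _ in STATES]
    for _ in range(iters):
        V = [
            max(r + gamma * V[s2]
                for s2, r in (step(s, a, penalty) for a in ACTIONS))
            for s in STATES
        ]
    return V


def greedy_action(V, state, penalty, gamma=GAMMA):
    """Action maximizing the one-step lookahead value (ties go to 'stay')."""
    q = []
    for a in ACTIONS:
        s2, r = step(state, a, penalty)
        q.append(r + gamma * V[s2])
    return 0 if q[0] >= q[1] else 1


def is_safe(penalty):
    """True if the optimal policy keeps the viable state 0 inside the kernel."""
    V = value_iteration(penalty)
    return greedy_action(V, 0, penalty) == 0


if __name__ == "__main__":
    for p in (0.5, 1.0, 1.2, 2.0, 10.0):
        print(f"penalty={p:5.1f}  safe={is_safe(p)}")
```

In this toy problem the threshold shrinks as failure gets closer in time or as discounting gets weaker, which is a small-scale instance of the interaction between penalty, rewards, discount factor, and dynamics described above. Any penalty above the threshold yields the same optimal behavior in state 0, consistent with the claim that larger penalties do not harm optimality.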
