Achieving Safe Control Online through Integration of Harmonic Control Lyapunov-Barrier Functions with Unsafe Object-Centric Action Policies
Marlow Fawn, Matthias Scheutz
TL;DR
This work tackles safe online adaptation of pretrained robot policies in dynamic environments by fusing Signal Temporal Logic (STL) constraints with Harmonic Control Lyapunov-Barrier Functions (HCLBFs). The method derives HCLBFs from a restricted STL fragment and solves Laplace's equation on a grid to obtain a potential field whose gradient serves as a safety signal. This gradient guides an existing policy through a shared velocity command, and a safety filter minimally alters the policy's actions to ensure safety. A proof-of-concept demonstrates an object-centric, force-based policy learned via Soft Actor-Critic (SAC) and deployed on a planar robot arm, safeguarded so that it avoids obstacles while pursuing a goal. The approach provides formal safety guarantees, supports online adaptation, and can be extended to richer temporal specifications and higher-dimensional workspaces, offering a pathway to dynamically safe behavior without retraining the policy.
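To make the grid-based potential field concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a 2D occupancy grid, a potential fixed to 1 on obstacles and on the outer border and to 0 at the goal, and plain Jacobi relaxation as the solver. The names `solve_harmonic_potential` and `descent_direction` and all parameter values are hypothetical.

```python
import numpy as np


def solve_harmonic_potential(obstacle_mask, goal_mask, iters=10000, tol=1e-8):
    """Approximate a harmonic potential on a 2D grid by Jacobi relaxation.

    Assumed boundary conditions (illustrative, not taken from the paper):
    potential 1 on obstacles and on the outer border, 0 on goal cells.
    Goal cells are assumed to lie in free space, away from the border.
    """
    unsafe = obstacle_mask.copy()
    unsafe[0, :] = unsafe[-1, :] = True
    unsafe[:, 0] = unsafe[:, -1] = True
    fixed = unsafe | goal_mask  # cells whose value never changes

    phi = np.ones(obstacle_mask.shape, dtype=float)
    phi[goal_mask] = 0.0

    for _ in range(iters):
        # Discrete Laplace equation: each free cell equals the mean of its
        # four neighbours.
        avg = np.zeros_like(phi)
        avg[1:-1, 1:-1] = 0.25 * (
            phi[:-2, 1:-1] + phi[2:, 1:-1] + phi[1:-1, :-2] + phi[1:-1, 2:]
        )
        new_phi = np.where(fixed, phi, avg)
        delta = np.max(np.abs(new_phi - phi))
        phi = new_phi
        if delta < tol:
            break
    return phi


def descent_direction(phi):
    """Return the negative gradient of phi as (row, column) component arrays."""
    d_row, d_col = np.gradient(phi)
    return -d_row, -d_col
```

Following the descent direction moves away from obstacles and toward the goal; a harmonic potential has no local minima in the interior of free space, which is the usual motivation for solving Laplace's equation instead of summing hand-tuned attractive and repulsive terms.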
Abstract
We propose a method for combining Harmonic Control Lyapunov-Barrier Functions (HCLBFs) derived from Signal Temporal Logic (STL) specifications with any given robot policy, turning an unsafe policy into a safe one with formal guarantees. The two components are combined via HCLBF-derived safety certificates, producing commands that preserve both safety and task-driven behavior. We demonstrate the method with a simple proof-of-concept implementation: an object-centric, force-based policy is trained through reinforcement learning on a movement task for a stationary robot arm and, once combined with the safety constraints, avoids colliding with obstacles on a tabletop. The proposed method can be generalized to more complex specifications and dynamic task settings.
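One common way to combine a nominal policy command with a safety certificate is a minimal-norm correction, sketched below. This is an illustrative assumption, not the paper's stated formulation: it assumes single-integrator dynamics (the command is a velocity), a certificate condition of the form grad_phi · u <= -alpha * phi, and a hypothetical gain `alpha`. With a single affine constraint, the usual quadratic program reduces to a closed-form projection onto a half-space.

```python
import numpy as np


def safety_filter(u_nom, grad_phi, phi_value, alpha=1.0, eps=1e-12):
    """Return the command closest to u_nom that satisfies
    grad_phi . u <= -alpha * phi_value.

    Assumptions (hypothetical, for illustration only): single-integrator
    dynamics, one affine certificate constraint, no actuator limits.
    """
    u_nom = np.asarray(u_nom, dtype=float)
    a = np.asarray(grad_phi, dtype=float)
    b = -alpha * phi_value

    violation = a @ u_nom - b
    if violation <= 0.0:
        # The nominal policy command already satisfies the certificate.
        return u_nom

    norm_sq = a @ a
    if norm_sq < eps:
        # Zero gradient: no velocity can satisfy the constraint here.
        raise ValueError("certificate constraint is infeasible at this state")

    # Euclidean projection of u_nom onto the half-space {u : a . u <= b}.
    return u_nom - (violation / norm_sq) * a
```

The filter leaves the policy's command untouched whenever it is already safe and otherwise removes only the component that violates the constraint, so task-driven behavior is preserved as far as the certificate allows.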
