Data-Driven Permissible Safe Control with Barrier Certificates
Rayan Mazouz, John Skovbekk, Frederik Baymler Mathiesen, Eric Frew, Luca Laurenti, Morteza Lahijanian
TL;DR
The paper addresses safety for stochastic systems with unknown dynamics and continuous control inputs. It learns the dynamics with Gaussian process (GP) regression and certifies safety with piecewise stochastic barrier functions (PW-SBFs). From these it constructs a maximal set of permissible strategies that guarantee probabilistic safety, with the guarantees derived from bounds on the learned transition kernel and from PW-SBF-based invariance theorems. The method combines GP learning-error bounds, a partition of the state-control space, and linear-programming (LP) synthesis to produce a safe, data-driven action set. It is validated on linear and nonlinear systems, where more data enlarges the permissible strategy set and tightens the safety guarantees. The result supports safe exploration and data-efficient data collection in learning-enabled systems, with theoretical guarantees that cover continuous control and nonconvex safe sets.
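To make the GP learning step concrete, the following is a minimal sketch of GP regression on one-step transition data, using plain NumPy. The toy dynamics `true_step`, the kernel hyperparameters, and the noise level are all assumptions chosen for illustration and are not taken from the paper. The posterior standard deviation here is only a stand-in for the probabilistic learning-error bounds the paper derives.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(x_query, x_query)) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def true_step(x, u):
    # Hypothetical 1-D dynamics, treated as unknown by the learner.
    return 0.8 * x + 0.5 * u

rng = np.random.default_rng(0)
inputs = rng.uniform(-1.0, 1.0, size=(200, 2))  # columns: state x, control u
targets = true_step(inputs[:, 0], inputs[:, 1]) + 0.1 * rng.standard_normal(200)

query = np.array([[0.3, -0.2]])  # one (state, control) pair
# The small dataset is a subset of the large one, so with fixed
# hyperparameters the posterior uncertainty cannot increase.
mean_small, std_small = gp_posterior(inputs[:10], targets[:10], query)
mean_large, std_large = gp_posterior(inputs, targets, query)
print("10 samples: ", mean_small[0], "+/-", std_small[0])
print("200 samples:", mean_large[0], "+/-", std_large[0])
```

With fixed hyperparameters, adding samples can only shrink the posterior standard deviation at a query point. This is the mechanism behind the observation that more data tightens the guarantees.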
Abstract
This paper introduces a method of identifying a maximal set of safe strategies from data for stochastic systems with unknown dynamics using barrier certificates. The first step is learning the dynamics of the system via Gaussian process (GP) regression and obtaining probabilistic errors for this estimate. Then, we develop an algorithm for constructing piecewise stochastic barrier functions to find a maximal permissible strategy set using the learned GP model, which is based on sequentially pruning the worst controls until a maximal set is identified. The permissible strategies are guaranteed to maintain probabilistic safety for the true system. This is especially important for learning-enabled systems, because a rich strategy space enables additional data collection and complex behaviors while remaining safe. Case studies on linear and nonlinear systems demonstrate that increasing the size of the dataset for learning the system grows the permissible strategy set.
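The pruning idea can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: the barrier-function construction is replaced by a plain union bound over the horizon, and the table `step_risk` of one-step unsafety bounds per (region, control cell) pair is a hypothetical input. The function name `prune_controls` and all numbers are likewise assumptions.

```python
import numpy as np

def prune_controls(step_risk, horizon, max_unsafe_prob):
    """Greedily remove the riskiest (region, control) pair until the bound holds.

    step_risk[q, a] is an assumed upper bound on the one-step probability of
    leaving the safe set from region q under control cell a.
    Returns a boolean mask of permissible pairs, or None if some region
    would be left with no permissible control.
    """
    permissible = np.ones_like(step_risk, dtype=bool)
    while True:
        # Consider only pairs that are still permissible.
        worst = np.where(permissible, step_risk, -np.inf)
        q, a = np.unravel_index(np.argmax(worst), worst.shape)
        # Union bound over the horizon, driven by the worst remaining pair.
        if horizon * worst[q, a] <= max_unsafe_prob:
            return permissible
        permissible[q, a] = False
        if not permissible[q].any():
            return None

# Two regions, three control cells each (illustrative numbers only).
step_risk = np.array([[0.001, 0.020, 0.004],
                      [0.002, 0.003, 0.050]])
mask = prune_controls(step_risk, horizon=10, max_unsafe_prob=0.05)
strict = prune_controls(step_risk, horizon=10, max_unsafe_prob=0.005)
print(mask)
print(strict)
```

In this toy setting the loop reduces to thresholding the risk table. In the paper, each pruning step instead interacts with a barrier certificate synthesized from the learned GP model. The sketch only shows the loop structure: remove the worst control, re-check the guarantee, and stop as soon as the remaining set is certified.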
