Incremental Composition of Learned Control Barrier Functions in Unknown Environments
Paul Lutkus, Deepika Anantharaman, Stephen Tu, Lars Lindemann
TL;DR
The paper addresses safe exploration of unknown environments by incrementally learning local control barrier functions (CBFs) and composing them into a global, non-smooth CBF with a max operator. Local CBFs are parameterized with compactly-supported radial basis functions (CS-RBFs), which guarantees that each local CBF is negative outside its training domain; this makes the max-based composition valid and yields forward invariance of the union of the local safe sets. An online learning loop uses sensor data (e.g., LiDAR) and an optional safety oracle to learn locally valid CBFs, while the global CBF H(x) = max_i h_i(x) expands the safe region as more data becomes available. Case studies on a Dubins car and a planar system demonstrate safe exploration with improved efficiency over single-shot approaches. The approach provides a principled framework for growing safety certificates online, with potential extensions to dynamic and stochastic environments.
Abstract
We consider the problem of safely exploring a static and unknown environment while learning valid control barrier functions (CBFs) from sensor data. Existing works either assume known environments, target specific dynamics models, or use a priori valid CBFs, and are thus limited in their safety guarantees for general systems during exploration. We present a method for safely exploring the unknown environment by incrementally composing a global CBF from locally learned CBFs. The challenge here is that local CBFs may not have well-defined end behavior outside their training domain, i.e., local CBFs may be positive (indicating safety) in regions where no training data is available. We show that well-defined end behavior can be obtained when local CBFs are parameterized by compactly-supported radial basis functions. For learning local CBFs, we collect sensor data, e.g., LiDAR data capturing obstacles in the environment, and augment it with simulated data from a safe oracle controller. Our work complements recent efforts to learn CBFs from safe demonstrations, where learned safe sets are limited to their training domains, by demonstrating how to grow the safe set over time as more data becomes available. We evaluate our approach on two simulated systems, where our method successfully explores an unknown environment while maintaining safety throughout the entire execution.
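The role of compact support in the max-based composition can be illustrated with a minimal Python sketch. It assumes a Wendland C2 kernel as the compactly-supported radial basis function and a constant negative offset, so that each local function is strictly negative wherever none of its kernels is active. The names `LocalCBF` and `global_cbf`, the kernel choice, and all numeric values are hypothetical and are not taken from the paper. The sketch only evaluates H(x) = max_i h_i(x); it does not learn weights or verify the CBF condition along the dynamics.

```python
import numpy as np


def wendland_c2(r):
    """Wendland C2 kernel: (1 - r)^4 * (4r + 1) for r < 1, and exactly 0 for r >= 1."""
    r = np.asarray(r, dtype=float)
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)


class LocalCBF:
    """Hypothetical local candidate h(x) = sum_j w_j * phi(||x - c_j|| / radius) - offset.

    With offset > 0, h(x) = -offset < 0 whenever x is farther than `radius`
    from every center, i.e. outside the region covered by the kernels.
    """

    def __init__(self, centers, weights, radius, offset):
        self.centers = np.atleast_2d(np.asarray(centers, dtype=float))
        self.weights = np.asarray(weights, dtype=float)
        self.radius = float(radius)
        self.offset = float(offset)

    def __call__(self, x):
        x = np.asarray(x, dtype=float)
        # Scaled distances from x to every kernel center.
        d = np.linalg.norm(self.centers - x, axis=1) / self.radius
        return float(self.weights @ wendland_c2(d) - self.offset)


def global_cbf(local_cbfs, x):
    """Non-smooth composition H(x) = max_i h_i(x)."""
    return max(h(x) for h in local_cbfs)


# Two illustrative local functions covering neighboring regions.
h1 = LocalCBF(centers=[[0.0, 0.0]], weights=[2.0], radius=1.0, offset=0.5)
h2 = LocalCBF(centers=[[1.5, 0.0]], weights=[2.0], radius=1.0, offset=0.5)

inside_h2 = global_cbf([h1, h2], [1.5, 0.0])  # only h2 is active at this point
far_away = global_cbf([h1, h2], [5.0, 0.0])   # no kernel is active at this point
```

Because every local function falls back to its negative offset away from its centers, the maximum can only be positive where at least one local function has supporting kernels. Adding a new local function to the list can therefore enlarge the set where H is nonnegative, but cannot mark far-away, unobserved states as safe.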
