Incremental Composition of Learned Control Barrier Functions in Unknown Environments

Paul Lutkus; Deepika Anantharaman; Stephen Tu; Lars Lindemann

TL;DR

The paper addresses safe exploration in unknown environments by incrementally learning local control barrier functions (CBFs) and composing them into a global, non-smooth CBF using a max operator. Local CBFs are parameterized with compactly-supported radial basis functions (CS-RBFs) so that they have well-defined end behavior, i.e. they are not positive outside their training domains; this enables correct max-based composition and forward invariance of the union of local safe sets. An online learning loop uses sensor data (e.g., LiDAR) and an optional safety oracle to generate locally-valid CBFs, while the global CBF $H(x)=\max_i h_i(x)$ expands the safe region as more data becomes available. Case studies on a Dubins car and a planar system demonstrate safe exploration, with less conservative behavior than a single CBF learned from all data at once (Figure 5), highlighting practical applicability for online safe autonomy. The approach provides a principled framework for growing safety certificates online, with potential extensions to dynamic and stochastic environments.
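The max-based composition described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the class `MaxComposedCBF` and the helper `toy_local_cbf` are hypothetical names, and the toy local function is only a stand-in for a learned CBF that is positive near its data and negative elsewhere.

```python
from typing import Callable, List

import numpy as np

LocalCBF = Callable[[np.ndarray], float]


class MaxComposedCBF:
    """Global candidate H(x) = max_i h_i(x), grown one local CBF at a time."""

    def __init__(self) -> None:
        self.local_cbfs: List[LocalCBF] = []

    def add(self, h: LocalCBF) -> None:
        # Adding a local CBF can only enlarge the set {x : H(x) >= 0}.
        self.local_cbfs.append(h)

    def value(self, x) -> float:
        if not self.local_cbfs:
            raise ValueError("no local CBFs have been added yet")
        return max(h(np.asarray(x, dtype=float)) for h in self.local_cbfs)

    def active_index(self, x) -> int:
        # Index of a local CBF attaining the max at x (first one on ties).
        values = [h(np.asarray(x, dtype=float)) for h in self.local_cbfs]
        return int(np.argmax(values))


def toy_local_cbf(center, radius) -> LocalCBF:
    """Hypothetical stand-in: positive inside a disc, negative outside it."""
    c = np.asarray(center, dtype=float)
    return lambda x: radius**2 - float(np.sum((x - c) ** 2))


H = MaxComposedCBF()
H.add(toy_local_cbf([0.0, 0.0], 1.0))
before = H.value([2.0, 0.0])   # point lies outside the first local safe set
H.add(toy_local_cbf([2.0, 0.0], 1.0))
after = H.value([2.0, 0.0])    # now covered by the second local safe set
```

The point of the sketch is that a state certified by any one local function is certified by `H`, which is why each local function must not be positive where it has no supporting data.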

Abstract

We consider the problem of safely exploring a static and unknown environment while learning valid control barrier functions (CBFs) from sensor data. Existing works either assume known environments, target specific dynamics models, or use a-priori valid CBFs, and are thus limited in their safety guarantees for general systems during exploration. We present a method for safely exploring the unknown environment by incrementally composing a global CBF from locally-learned CBFs. The challenge here is that local CBFs may not have well-defined end behavior outside their training domain, i.e. local CBFs may be positive (indicating safety) in regions where no training data is available. We show that well-defined end behavior can be obtained when local CBFs are parameterized by compactly-supported radial basis functions. For learning local CBFs, we collect sensor data, e.g. LiDAR capturing obstacles in the environment, and augment it with simulated data from a safe oracle controller. Our work complements recent efforts to learn CBFs from safe demonstrations -- where learned safe sets are limited to their training domains -- by demonstrating how to grow the safe set over time as more data becomes available. We evaluate our approach on two simulated systems, where our method successfully explores an unknown environment while maintaining safety throughout the entire execution.
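To make the end-behavior argument concrete, the following sketch evaluates a function built from compactly-supported radial basis functions. It rests on assumptions that the abstract does not state: the Wendland C2 kernel is used as the CS-RBF, and a constant positive `offset` is subtracted so that the function is strictly negative wherever no kernel is active. The names `wendland_c2` and `cs_rbf_cbf` are hypothetical.

```python
import numpy as np


def wendland_c2(r):
    """Wendland C2 kernel: (1 - r)^4 (4r + 1) for r < 1, exactly 0 for r >= 1."""
    r = np.asarray(r, dtype=float)
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)


def cs_rbf_cbf(x, centers, weights, support_radius, offset):
    """Evaluate h(x) = sum_j w_j * phi(||x - c_j|| / support_radius) - offset.

    Assumption (not from the source): offset > 0, so h(x) = -offset < 0
    at every x farther than support_radius from all centers.
    """
    x = np.asarray(x, dtype=float)
    dists = np.linalg.norm(centers - x, axis=1)
    return float(weights @ wendland_c2(dists / support_radius) - offset)


centers = np.array([[0.0, 0.0], [1.0, 0.0]])   # hypothetical kernel centers
weights = np.array([1.0, 1.0])                 # hypothetical learned weights
h_near = cs_rbf_cbf([0.0, 0.0], centers, weights, support_radius=1.0, offset=0.2)
h_far = cs_rbf_cbf([5.0, 5.0], centers, weights, support_radius=1.0, offset=0.2)
```

Because every kernel vanishes identically beyond its support radius, the value far from the data is determined by the offset alone, independent of the learned weights.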

Paper Structure (9 sections, 2 theorems, 4 equations, 5 figures)

Introduction
Background
Learning CBFs from Data
Compactly Supported Radial Basis Functions
Safe Exploration via Incremental Composition of CBFs
Obtaining Expert Demonstrations Online
Composing Locally-Valid CBFs
Case Studies
Conclusion

Key Result

Lemma 1

Given a CBF $h:D\rightarrow\mathbb{R}$ and a reference control signal $u_r:\mathbb{R}_{\ge 0}\to U$, we compute a control signal $u^*:\mathbb{R}_{\ge 0}\to U$ that renders the set $S$ forward invariant by solving the convex quadratic program:
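The quadratic program itself is not reproduced in this summary. A form commonly used in the CBF literature is sketched below under assumptions that are not taken from the source: control-affine dynamics $\dot{x}=f(x)+g(x)u$, an extended class-$\mathcal{K}$ function $\alpha$, a convex input set $U$, and $S=\{x\in D : h(x)\geq 0\}$.

$$
u^*(t)=\operatorname*{argmin}_{u\in U}\ \|u-u_r(t)\|^2 \quad \text{s.t.} \quad \big\langle \nabla h(x(t)),\, f(x(t))+g(x(t))\,u \big\rangle + \alpha\big(h(x(t))\big) \geq 0
$$

Under these assumptions the constraint is affine in $u$ and the objective is a convex quadratic, which is what makes the program convex.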

Figures (5)

Figure 1: The scan $M(x(t_3))$ detects the red hexagonal obstacle $O$. The learned $h_3$ is negative and therefore has no invariant set $S$, so the system remains in the safe set of the $\max$ CBF.
Figure 2: Left-to-Right: Incremental learning of global Dubins car CBF, while safely exploring unknown environment under input constraints. LiDAR scans are taken where the trajectory changes color. Showcasing the correctness of local CBFs, each trajectory is constrained by the last-learned local CBF, whose zero-level set is shaded. At each $(q^1,q^2)$, red arcs denote unsafe angles $h(q^1,q^2,\theta)<0$. Blue arcs denote $h(q^1,q^2,\theta)\geq0$.
Figure 3: Global CBF for Dubins car. The pink trajectory of the system under the global CBF interpolates the system behavior under the local CBFs shown in Figure 2.
Figure 4: CS-RBF barrier (teal contours) constrains Dubins car (pink) to the safe region. Signed distance functions cannot certify forward invariance for this system (gold, leaves safe set), even with unbounded actuation.
Figure 5: $h_i|X_i$ denotes that $h_i$ was obtained from data $X_i$. Top: The gradient norms of $\max_i\{h_i|X_i\}$ are visibly larger than those of $h|\bigcup_iX_i$, and the zero-level set of $\max_i\{h_i|X_i\}$ is closer to the central obstacle. Bottom: The trajectory under $\max_i\{h_i|X_i\}$ reaches the target faster, with larger actuation and speed while maintaining safety, suggesting decreased conservatism.

Theorems & Definitions (6)

Definition 1: Control Barrier Function
Lemma 1: Safety Filter [ames2019control]
Definition 2: Learning-CBF QP [robey2020learning]
Remark 1
Definition 3: Safety Oracle
Theorem 1: Forward Invariance for $H(x)=\max_i \{h_i(x)\}$
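The safety filter named in Lemma 1 can be sketched numerically for the simplest case. This is an illustration under assumptions that do not come from the source: control-affine dynamics, a single affine CBF constraint $L_g h(x)\,u + L_f h(x) + \alpha(h(x)) \geq 0$, and no input bounds, in which case the minimum-deviation quadratic program has a closed-form solution (a projection onto a half-space). The function name `safety_filter` and its arguments are hypothetical.

```python
import numpy as np


def safety_filter(u_ref, lie_f_h, lie_g_h, alpha_h):
    """Closest input to u_ref satisfying lie_g_h @ u + lie_f_h + alpha_h >= 0.

    Assumes unbounded inputs (U = R^m); with input constraints a QP solver
    would be needed instead of this closed form.
    """
    u_ref = np.asarray(u_ref, dtype=float)
    a = np.asarray(lie_g_h, dtype=float)
    residual = float(a @ u_ref) + lie_f_h + alpha_h
    if residual >= 0.0:
        return u_ref  # reference input already satisfies the constraint
    norm_sq = float(a @ a)
    if norm_sq == 0.0:
        raise ValueError("infeasible: L_g h = 0 while L_f h + alpha(h) < 0")
    # Project u_ref onto the boundary of the half-space {u : a @ u + b >= 0}.
    return u_ref - (residual / norm_sq) * a


# Toy example: single integrator x' = u with h(x) = 1 - ||x||^2 at x = (0.9, 0),
# so grad h = (-1.8, 0), L_f h = 0, L_g h = grad h, and alpha(h) = h = 0.19.
grad_h = np.array([-1.8, 0.0])
u_safe = safety_filter([1.0, 0.0], 0.0, grad_h, 0.19)
u_kept = safety_filter([-1.0, 0.0], 0.0, grad_h, 0.19)
```

In the toy example the outward-pointing reference input is scaled back until the constraint holds with equality, while the inward-pointing one is returned unchanged.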
