Learning a Better Control Barrier Function Under Uncertain Dynamics

Bolun Dai; Prashanth Krishnamurthy; Farshad Khorrami

TL;DR

This work addresses safety-critical control under uncertain dynamics by jointly learning a refined CBF and a model of the true system dynamics, starting from a conservative handcrafted CBF (HCBF) and a nominal model. The authors introduce loss functions that leverage a distance-based prior to avoid trivial CBF solutions, and employ a deep differential network to learn $\Delta h$ along with a dynamics model for $\Delta f$ and $\Delta g$, enabling a CBF-QP safety filter that adapts online to uncertainty. The approach is trained offline via replay buffers and evaluated on a double integrator, a unicycle, and a two-link arm, showing that the learned CBF and dynamics yield safe trajectories even when the nominal models are inaccurate. The contributions are (1) a CBF refinement method using a distance prior, (2) an extension to uncertain dynamics through joint learning of the CBF and the dynamics, and (3) empirical validation on three safety-critical systems in simulation.
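
For orientation, the safety filter referred to in this summary is usually posed as a small quadratic program. The block below is a sketch of the standard CBF-QP form, not the paper's exact formulation: the split into nominal terms plus learned residuals is assumed from the $\Delta$ notation, and the symbols $h_0$, $f_0$, $g_0$, $u_{\mathrm{ref}}$, and $\alpha$ are illustrative names introduced here.

```latex
% Assumed decomposition: nominal model and HCBF plus learned residuals
\dot{x} = \underbrace{f_0(x) + \Delta f(x)}_{f(x)} + \underbrace{\big(g_0(x) + \Delta g(x)\big)}_{g(x)}\, u,
\qquad
h(x) = h_0(x) + \Delta h(x)

% Standard CBF-QP safety filter around a performance controller u_ref
u^\star(x) = \arg\min_{u} \; \lVert u - u_{\mathrm{ref}}(x) \rVert^2
\quad \text{s.t.} \quad
\nabla h(x)^\top \big( f(x) + g(x)\, u \big) + \alpha\big(h(x)\big) \ge 0
```

Here $\alpha$ is an extended class-$\mathcal{K}$ function, and the constraint keeps the set $\{x : h(x) \ge 0\}$ forward invariant when $h$ is a valid CBF and the dynamics model is accurate.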

Abstract

Using control barrier functions (CBFs) as safety filters provides a computationally inexpensive yet effective method for constructing controllers in safety-critical applications. However, using CBFs requires a valid CBF, which is well known to be challenging to construct, and accurate system dynamics, which are often unavailable. This paper presents a learning-based approach that learns a valid CBF and the system dynamics starting from a conservative handcrafted CBF (HCBF) and the nominal system dynamics. We devise new loss functions that better suit the CBF refinement pipeline and produce well-behaved CBFs through the use of distance functions. By adopting an episodic learning approach, our proposed method learns the system dynamics without requiring additional interactions with the environment. Additionally, we provide a theoretical analysis of the quality of the learned system dynamics. We show that our proposed learning approach can effectively learn a valid CBF and an estimate of the actual system dynamics. The effectiveness of our proposed method is empirically demonstrated through simulation studies on three systems: a double integrator, a unicycle, and a two-link arm.
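
To make the idea of a CBF safety filter concrete, here is a minimal sketch on a double integrator, assuming known dynamics and a hand-picked barrier. It is not the paper's learned pipeline: the barrier `h = p_max - p - k*v`, the gains, and the function names `cbf_qp_filter` and `simulate` are all illustrative assumptions. With a single affine constraint the QP has a closed-form solution, so no solver is needed.

```python
import numpy as np


def cbf_qp_filter(u_ref, Lf_h, Lg_h, h, alpha=1.0):
    """Closed-form solution of
        min ||u - u_ref||^2  s.t.  Lf_h + Lg_h @ u + alpha * h >= 0.
    A linear class-K function alpha * h is assumed for simplicity.
    """
    u_ref = np.atleast_1d(np.asarray(u_ref, dtype=float))
    a = np.atleast_1d(np.asarray(Lg_h, dtype=float))
    slack = Lf_h + a @ u_ref + alpha * h
    if slack >= 0.0:
        return u_ref  # performance control already satisfies the constraint
    denom = a @ a
    if denom < 1e-12:
        raise ValueError("Lg_h is zero: the constraint cannot be enforced via u")
    # Project u_ref onto the half-space boundary {u : Lf_h + a @ u + alpha*h = 0}
    return u_ref - (slack / denom) * a


def simulate(steps=1000, dt=0.01, p_max=1.0, k=1.0, alpha=2.0):
    """Double integrator (p' = v, v' = u) driven toward p = 2 while the
    filter enforces p <= p_max through the illustrative barrier
    h(x) = p_max - p - k * v (a hypothetical handcrafted CBF)."""
    x = np.array([0.0, 0.0])  # state [position, velocity]
    positions, barrier_values = [], []
    for _ in range(steps):
        p, v = x
        h = p_max - p - k * v
        positions.append(p)
        barrier_values.append(h)
        u_ref = 4.0 * (2.0 - p) - 2.0 * v  # PD performance controller (unsafe target)
        Lf_h = -v                          # dh/dx . f(x) with f(x) = [v, 0]
        Lg_h = np.array([-k])              # dh/dx . g(x) with g(x) = [0, 1]
        u = cbf_qp_filter(u_ref, Lf_h, Lg_h, h, alpha)
        x = x + dt * np.array([v, u[0]])   # explicit Euler step
    return np.array(positions), np.array(barrier_values)


if __name__ == "__main__":
    ps, hs = simulate()
    print("max position:", ps.max(), "min barrier value:", hs.min())
```

In this sketch the filter overrides the performance controller whenever it would violate the constraint, so the position settles near `p_max` instead of reaching the target at 2. The method summarized above would replace the hand-picked `h`, `Lf_h`, and `Lg_h` with learned quantities.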

Paper Structure (12 sections, 60 equations, 11 figures)

This paper contains 12 sections, 60 equations, 11 figures.

Introduction
Preliminaries
Problem Formulation
Method
Learning the Control Barrier Function
Learning the System Dynamics
Training Process
Simulation Studies
Double Integrator
Unicycle
Two-Link Arm
Conclusion

Figures (11)

Figure 1: Illustration of the effect of the distance function $\mathbf{d}(\mathbf{x})$. These figures show the result of learning a CBF for a target-reaching-obstacle-avoidance task on a unicycle (see the Unicycle section for details). In the upper graph, the light blue region represents the unsafe set under the HCBF, and the darker blue region represents the obstacle. The upper graph shows the trajectory generated by a CBF-QP controller with a proportional controller (see the Unicycle section for details) as the performance controller and using the learned CBF-QP. The lower graph shows the evolution of the CBF value along the trajectory.
Figure 2: Illustration of the overall training procedure. The "@" sign represents matrix multiplication. The gray boxes represent the non-learning components. The purple box represents the learned CBF, the blue box represents the learned system dynamics, the green box represents the buffer storing safe interactions, and the red box represents the buffer storing unsafe interactions.
Figure 3: Illustration of the state and control trajectory for the double integrator system. The learned CBF-QP controller generates the trajectories under two different initial guesses of the system dynamics.
Figure 4: Contour of the learned CBF for the double integrator system. The orange dashed line denotes the zero-level line of the true CBF. The true CBF should be negative above the orange dashed line and positive below it. The contour values correspond to the CBF level sets.
Figure 5: Comparison between the trajectories generated by the estimated (learned) dynamics and the ground-truth dynamics, starting from different initial guesses, for the double integrator system.
...and 6 more figures
