Meta-Reinforcement Learning for Robust and Non-greedy Control Barrier Functions in Spacecraft Proximity Operations
Minduli C. Wijayatunga, Richard Linares, Roberto Armellin
TL;DR
The paper tackles safety-critical spacecraft proximity operations under thrust limits and uncertainty by introducing learnable input-constrained control barrier functions (ICCBFs) whose full class-$\mathcal{K}$ recursion is parameterized and tuned via meta-reinforcement learning. A differential-algebra–based control margin is computed so that forward invariance is guaranteed under time-sampled execution, enabling real-time safety enforcement through a quadratic program (QP). The authors compare MLP and RNN (LSTM) meta-policies across cruise control, docking, and 3D inspection tasks, showing reduced fuel consumption and improved feasibility relative to fixed class-$\mathcal{K}$ ICCBFs, with RNNs providing the strongest performance in complex, partially observed scenarios. This work advances practical safe autonomy for space missions by combining learnable barrier functions, efficient inter-sample margins, and memory-enabled adaptation to hidden parameters.
Abstract
Autonomous spacecraft inspection and docking missions require controllers that can guarantee safety under thrust constraints and uncertainty. Input-constrained control barrier functions (ICCBFs) provide a framework for safety certification under bounded actuation; however, conventional ICCBF formulations can be overly conservative and exhibit limited robustness to uncertainty, leading to high fuel consumption and reduced mission feasibility. This paper proposes a framework in which the full hierarchy of class-$\mathcal{K}$ functions defining the ICCBF recursion is parameterized and learned, enabling localized shaping of the safe set and reduced conservatism. A control margin is computed efficiently using differential algebra to enable the learned continuous-time ICCBFs to be implemented on time-sampled dynamical systems typical of spacecraft proximity operations. A meta-reinforcement learning scheme is developed to train a policy that generates ICCBF parameters over a distribution of hidden physical parameters and uncertainties, using both multilayer perceptron (MLP) and recurrent neural network (RNN) architectures. Simulation results on cruise control, spacecraft inspection, and docking scenarios demonstrate that the proposed approach maintains safety while reducing fuel consumption and improving feasibility relative to fixed class-$\mathcal{K}$ ICCBFs, with the RNN showing a particularly strong advantage in the more complex inspection case.
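For reference, the ICCBF recursion mentioned above can be written out as follows. This is a sketch in the standard ICCBF notation from the literature, which may differ from the notation used in the body of the paper. The control-affine dynamics $\dot{x} = f(x) + g(x)u$, the input set $\mathcal{U}$, the safety function $h$, the recursion depth $N$, and the linear form of the learned functions given at the end are assumptions introduced here for illustration.

$$
b_0(x) = h(x), \qquad
b_i(x) = \inf_{u \in \mathcal{U}} \left[ L_f b_{i-1}(x) + L_g b_{i-1}(x)\,u + \alpha_{i-1}\big(b_{i-1}(x)\big) \right], \quad i = 1, \dots, N,
$$

$$
\sup_{u \in \mathcal{U}} \left[ L_f b_N(x) + L_g b_N(x)\,u + \alpha_N\big(b_N(x)\big) \right] \ge 0
\quad \text{for all } x \in \mathcal{C}^* = \bigcap_{i=0}^{N} \{ x : b_i(x) \ge 0 \}.
$$

Each $\alpha_i$ is a class-$\mathcal{K}$ function. A fixed ICCBF chooses them once, ahead of time. One simple learnable choice (hypothetical, for illustration only) would be $\alpha_i(r) = \theta_i r$ with $\theta_i > 0$ produced by the meta-policy, so that the whole set $\{\theta_0, \dots, \theta_N\}$ shapes the inner safe set $\mathcal{C}^*$.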
