Ensuring Safety in Target Pursuit Control: A CBF-Safe Reinforcement Learning Approach
Yaosheng Deng, Junjie Gao, Jiaping Xiao, Mir Feroskhan
TL;DR
This work addresses safe target pursuit in multi-agent settings under three requirements: avoiding collisions, keeping the target within sensing range, and respecting input constraints. It introduces CSRL, a framework that couples a trained RL policy with a safety filter formed by three adaptive CBFs and a switch strategy that improves the feasibility of the filter's QP under disturbances. The safety filter solves this QP to convert unsafe RL actions into safe controls, and the resulting solutions are proven to satisfy the KKT conditions for all safety constraints. Simulations with UAV-style agents show that CSRL maintains safety while achieving robust tracking in complex, disturbance-rich scenarios, outperforming RL-only approaches. The approach offers a practical path to safe, real-time pursuit in dynamic environments.
Abstract
This paper addresses the target-pursuit problem, aiming to ensure each pursuer's safety with respect to collision avoidance, sensing range, and input saturation. An input-constrained Control Barrier Function (CBF) is proposed to dynamically regulate the pursuer's control, ensuring effective target pursuit even when the target performs evasive maneuvers. To further ensure safety, two sets of CBF constraints are designed to regulate the pursuer's position, enabling it to keep the target within the sensing range while avoiding collisions in complex environments with external disturbances. These three CBFs collectively form our safety filter, which filters unsafe outputs from the reinforcement learning (RL) policy by solving a Quadratic Program (QP). Finally, the safety filter, combined with a switch strategy that enhances the feasibility of solving its QP, constitutes the CBF-Safe Reinforcement Learning (CSRL) algorithm, whose solutions are proven to satisfy the Karush-Kuhn-Tucker (KKT) conditions for all safety constraints. Simulation results validate the effectiveness of the CSRL algorithm, demonstrating its ability to handle complex pursuit scenarios while maintaining safety and improving control performance.
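To make the idea of a QP-based safety filter concrete, the following is a minimal sketch and not the paper's CSRL algorithm. It assumes single-integrator dynamics (p_dot = u), a single circular obstacle, and one barrier h(p) = ||p - p_obs||^2 - r_safe^2. The function name `cbf_filter` and the parameters `r_safe` and `alpha` are hypothetical, and the sketch omits the sensing-range and input-constrained CBFs as well as the switch strategy. With only one affine constraint, the QP "minimize ||u - u_rl||^2 subject to a·u >= b" has a closed-form KKT solution, so no solver is needed.

```python
import numpy as np

def cbf_filter(u_rl, p, p_obs, r_safe, alpha=1.0):
    """Minimally modify an RL action so one CBF constraint holds.

    Illustrative assumptions: dynamics p_dot = u, a single obstacle,
    and a linear class-K term alpha * h.
    """
    u_rl = np.asarray(u_rl, dtype=float)
    d = np.asarray(p, dtype=float) - np.asarray(p_obs, dtype=float)
    h = d @ d - r_safe**2      # barrier value; h >= 0 means safe
    a = 2.0 * d                # gradient of h, so h_dot = a @ u
    b = -alpha * h             # CBF condition: a @ u >= b
    if a @ a == 0.0:
        raise ValueError("pursuer coincides with obstacle center")
    slack = a @ u_rl - b
    if slack >= 0.0:
        return u_rl            # RL action already satisfies the constraint
    # Constraint active: KKT multiplier lam >= 0 shifts u_rl along a
    lam = -slack / (a @ a)
    return u_rl + lam * a

# Pursuer at (2, 0), obstacle at the origin with safety radius 1
u_safe = cbf_filter([-2.0, 0.5], [2.0, 0.0], [0.0, 0.0], 1.0)
```

In this usage example the RL action drives the pursuer toward the obstacle too quickly, so the filter reduces only the component along the barrier gradient and leaves the tangential component unchanged. Handling several constraints at once, as the three-CBF filter described above does, would generally require a numerical QP solver, and feasibility is then no longer automatic.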
