Collision Avoidance Verification of Multiagent Systems with Learned Policies
Zihao Dong, Shayegan Omidshafiei, Michael Everett
TL;DR
This work addresses the lack of formal collision-avoidance guarantees for multiagent systems with neural network controllers by introducing ReBAR and ReBAR-MA, a relative backward reachability framework. Offline, it computes conservative over-approximations of relative backprojection sets (RBPOAs) by solving MILPs in a relative coordinate frame; online, it performs fast safety checks by solving LPs under state uncertainty. The methods are demonstrated on Multi-Agent Neural Feedback Loops (MA-NFLs) trained to imitate Reciprocal Velocity Obstacles (RVO) and scale to systems with up to 10 agents, with a trade-off between offline verification time and online safety assurance. Because verification is performed pairwise, the safety guarantees extend to larger multiagent systems, which makes the approach practical for real-time collision avoidance in safety-critical applications. Future directions include handling more general activation functions and decentralized observation spaces.
Abstract
For many multiagent control problems, neural networks (NNs) have enabled promising new capabilities. However, many of these systems lack formal guarantees (e.g., collision avoidance, robustness), which prevents leveraging these advances in safety-critical settings. While there is recent work on formal verification of NN-controlled systems, most existing techniques cannot handle scenarios with more than one agent. To address this research gap, this paper presents a backward reachability-based approach for verifying the collision avoidance properties of Multi-Agent Neural Feedback Loops (MA-NFLs). Given the dynamics models and trained control policies of each agent, the proposed algorithm computes relative backprojection sets by (simultaneously) solving a series of Mixed Integer Linear Programs (MILPs) offline for each pair of agents. The approach accounts for state measurement uncertainties, which makes it well aligned with real-world scenarios. Using the offline results, the agents can quickly check for collision avoidance online by solving low-dimensional Linear Programs (LPs). We demonstrate that the proposed algorithm can verify collision-free properties of an MA-NFL with agents trained to imitate a collision avoidance algorithm (Reciprocal Velocity Obstacles). We further demonstrate the computational scalability of the approach on systems with up to 10 agents.
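To make the online step concrete, the following is a minimal sketch of one way such an LP check could look. It is not the paper's implementation. It assumes (hypothetically) that the offline stage has produced a polytope {x : A x <= b} over-approximating a relative backprojection set, and that the measured relative state is only known to lie in an axis-aligned box [lo, hi]. The function name `may_collide` and the example numbers are illustrative only. If the LP is infeasible, the uncertain state cannot lie in the over-approximated unsafe set.

```python
import numpy as np
from scipy.optimize import linprog


def may_collide(A, b, lo, hi):
    """Return True if the box [lo, hi] intersects the polytope {x : A x <= b}.

    A, b   : assumed half-space form of an over-approximated backprojection set
             in relative coordinates (hypothetical output of the offline stage).
    lo, hi : bounds on the relative state under measurement uncertainty.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = A.shape[1]
    # Pure feasibility problem: zero objective, polytope as inequality
    # constraints, uncertainty box as variable bounds.
    res = linprog(
        c=np.zeros(n),
        A_ub=A,
        b_ub=b,
        bounds=list(zip(lo, hi)),
        method="highs",
    )
    # status 0: a feasible point exists, so the sets intersect.
    # status 2: the LP is infeasible, so the sets are disjoint.
    return bool(res.status == 0)


# Illustrative unsafe set: the square [-1, 1] x [-1, 1] in relative position.
A_unsafe = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b_unsafe = np.array([1.0, 1.0, 1.0, 1.0])

far_away = may_collide(A_unsafe, b_unsafe, lo=[2.0, 2.0], hi=[3.0, 3.0])
overlapping = may_collide(A_unsafe, b_unsafe, lo=[0.5, -0.2], hi=[1.5, 0.2])
```

In this sketch a `False` result certifies safety with respect to that one set, while `True` is inconclusive because the set is an over-approximation. In a pairwise scheme, a check of this kind would be repeated for each pair of agents and each time step of the verification horizon.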
