LTL Verification of Memoryful Neural Agents
Mehran Hosseini, Alessio Lomuscio, Nicola Paoletti
TL;DR
This work introduces the first framework for verifying full $LTL_{\mathbb{R}}$ specifications of Memoryful Neural Multi-Agent Systems under uncertainty, bridging formal methods with memoryful neural policies. It develops a bounded verification pipeline with decidability results and two solving techniques, Bound Propagation Through Time and Recursive MILP, augmented by adaptive splitting to scale to memoryful agents. For unbounded specifications, it combines Bounded-Path Model Checking, lasso search, and inductive invariant synthesis to achieve sound and complete results in applicable cases, despite general undecidability. Empirically, the approach yields order-of-magnitude improvements over state-of-the-art methods on Gymnasium and PettingZoo benchmarks and demonstrates verification of both time-bounded and unbounded properties for single- and multi-agent settings, enabling rigorous safety guarantees for MN-MAS.
Abstract
We present a framework for verifying Memoryful Neural Multi-Agent Systems (MN-MAS) against full Linear Temporal Logic (LTL) specifications. In MN-MAS, agents interact with a non-deterministic, partially observable environment. Examples of MN-MAS include multi-agent systems based on feed-forward and recurrent neural networks or state-space models. Different from previous approaches, we support the verification of both bounded and unbounded LTL specifications. We leverage well-established bounded model checking techniques, including lasso search and invariant synthesis, to reduce the verification problem to that of constraint solving. To solve these constraints, we develop efficient methods based on bound propagation, mixed-integer linear programming, and adaptive splitting. We evaluate the effectiveness of our algorithms in single and multi-agent environments from the Gymnasium and PettingZoo libraries, verifying unbounded specifications for the first time and improving the verification time for bounded specifications by an order of magnitude compared to the SoA.
