Safe and Reliable Training of Learning-Based Aerospace Controllers
Udayan Mandal, Guy Amir, Haoze Wu, Ieva Daukantas, Fletcher Lee Newell, Umberto Ravaioli, Baoluo Meng, Michael Durling, Kerianne Hobbs, Milan Ganai, Tobey Shim, Guy Katz, Clark Barrett
TL;DR
The paper tackles the challenge of safely deploying learning-based aerospace controllers by integrating verification into the DRL training loop (design-for-verification) and by applying formal tools such as $k$-induction, Neural Lyapunov Barrier certificates, and reachability methods to a 2D spacecraft docking benchmark. It demonstrates a scalable $k$-induction framework to prove liveness, aided by a verification-friendly DNN setup and a Manhattan-distance based ranking invariant, and shows that CERTIFICATES can provide safety guarantees in parallel. The study also evaluates alternative verification approaches, including RWA certificates and reachability analyses, highlighting where these methods succeed or fail on the benchmark. Together, the work advances practical safety guarantees for DRL in avionics by combining training adaptations with formal verification methods, and it outlines future directions for scaling to more complex aerospace systems.
Abstract
In recent years, deep reinforcement learning (DRL) approaches have generated highly successful controllers for a myriad of complex domains. However, the opaque nature of these models limits their applicability in aerospace systems and safety-critical domains, in which a single mistake can have dire consequences. In this paper, we present novel advancements in both the training and verification of DRL controllers, which can help ensure their safe behavior. We showcase a design-for-verification approach utilizing k-induction and demonstrate its use in verifying liveness properties. In addition, we also give a brief overview of neural Lyapunov Barrier certificates and summarize their capabilities on a case study. Finally, we describe several other novel reachability-based approaches which, despite failing to provide guarantees of interest, could be effective for verification of other DRL systems, and could be of further interest to the community.
