Neural Control and Certificate Repair via Runtime Monitoring

Emily Yu; Đorđe Žikelić; Thomas A. Henzinger

TL;DR

This paper addresses safety and stability for neural network control policies in environments with unknown dynamics. A policy is learned jointly with a certificate function (e.g., a barrier or Lyapunov function), and runtime monitoring is then used to check and improve their reliability. The paper introduces two monitors, CertPM and PredPM, that observe both the policy and the certificate in a black-box setting, flag violations, and collect counterexamples that are used to repair the policy and certificate in a data-driven loop. Empirical results on DroneEnv and ShipEnv show that monitoring-assisted repair significantly improves safety metrics and certificate validity, with PredPM offering predictive warnings and CertPM excelling in policy repair. The work provides a practical, data-efficient path to safer deployment of learning-based controllers when the dynamics are not fully known, and it outlines extensions to Lyapunov-based monitoring as well as future work on stochastic and multi-agent settings.

Abstract

Learning-based methods provide a promising approach to solving highly non-linear control tasks that are often challenging for classical control methods. To ensure the satisfaction of a safety property, learning-based methods jointly learn a control policy together with a certificate function for the property. Popular examples include barrier functions for safety and Lyapunov functions for asymptotic stability. While there has been significant progress on learning-based control with certificate functions in the white-box setting, where the correctness of the certificate function can be formally verified, there has been little work on ensuring their reliability in the black-box setting, where the system dynamics are unknown. In this work, we consider the problems of certifying and repairing neural network control policies and certificate functions in the black-box setting. We propose a novel framework that utilizes runtime monitoring to detect system behaviors that violate the property of interest under some initially trained neural network policy and certificate. These violating behaviors are used to extract new training data, which is then used to re-train the neural network policy and the certificate function and ultimately to repair them. We demonstrate the effectiveness of our approach empirically by using it to repair and to boost the safety rate of neural network policies learned by a state-of-the-art method for learning-based control on two autonomous system control tasks.
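
The monitoring step described in the abstract can be illustrated with a minimal sketch. This is not the paper's CertPM or PredPM algorithm. It assumes a common discrete-time barrier convention in which B(x) is negative on unsafe states and B(x') >= (1 - alpha) * B(x) holds along every observed transition. The names `monitor_trajectory`, `toy_barrier`, `toy_is_unsafe`, and `alpha` are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]


def monitor_trajectory(
    trajectory: Sequence[State],
    barrier: Callable[[State], float],
    is_unsafe: Callable[[State], bool],
    alpha: float = 0.1,
) -> List[Tuple[State, State]]:
    """Return observed transitions that violate the assumed barrier conditions.

    Only observed states are used, so no model of the dynamics is needed
    (black-box setting).
    """
    counterexamples: List[Tuple[State, State]] = []
    for x, x_next in zip(trajectory, trajectory[1:]):
        # Assumed transition condition: B(x') >= (1 - alpha) * B(x).
        transition_violated = barrier(x_next) < (1.0 - alpha) * barrier(x)
        # Assumed separation condition: B must be negative on unsafe states.
        separation_violated = is_unsafe(x_next) and barrier(x_next) >= 0.0
        if transition_violated or separation_violated:
            counterexamples.append((x, x_next))
    return counterexamples


# Toy 1-D example (hypothetical): states with |x| > 1 are unsafe.
def toy_barrier(x: State) -> float:
    return 1.0 - x[0] ** 2


def toy_is_unsafe(x: State) -> bool:
    return abs(x[0]) > 1.0


observed = [(0.0,), (0.5,), (0.9,), (1.2,)]
violations = monitor_trajectory(observed, toy_barrier, toy_is_unsafe, alpha=0.5)
print(violations)
```

In a repair loop of the kind the abstract describes, transitions collected this way would be added to the training data used to re-train the policy and the certificate. How the paper weights or samples such data is not specified here.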
