Data-Driven Distributionally Robust Safety Verification Using Barrier Certificates and Conditional Mean Embeddings
Oliver Schön, Zhengang Zhong, Sadegh Soudjani
TL;DR
The paper addresses safety verification for unknown discrete-time stochastic systems by learning barrier certificates (BCs) directly from trajectory data. It rewrites the BC constraint as an inner product with the conditional mean embedding (CME) in a reproducing kernel Hilbert space (RKHS) and builds an RKHS ambiguity set around the empirical CME, which yields a distributionally robust, data-driven BC condition with probabilistic guarantees from finite data. The resulting program is solved via sum-of-squares (SOS) optimization and a Gaussian process (GP) envelope. Key contributions include a data-driven BC formulation with probabilistic guarantees, a practical SOS-based solver plus GP-based norm control, and a lane-keeping case study that suggests improved data efficiency compared to a state-of-the-art approach. The approach removes the need for restrictive assumptions on the system dynamics and uncertainty and provides probabilistic safety certificates from limited data.
Abstract
Algorithmic verification of realistic systems to satisfy safety and other temporal requirements has suffered from poor scalability of the employed formal approaches. To design systems with rigorous guarantees, many approaches still rely on exact models of the underlying systems. Since this assumption can rarely be met in practice, models have to be inferred from measurement data or are bypassed completely. Whilst the former usually requires the model structure to be known a priori and immense amounts of data to be available, the latter gives rise to a plethora of restrictive mathematical assumptions about the unknown dynamics. In pursuit of developing scalable formal verification algorithms without shifting the problem to unrealistic assumptions, we employ the concept of barrier certificates, which can guarantee safety of the system, and learn the certificate directly from a compact set of system trajectories. We use conditional mean embeddings to embed data from the system into a reproducing kernel Hilbert space (RKHS) and construct an RKHS ambiguity set that can be inflated to robustify the result w.r.t. a set of plausible transition kernels. We show how to solve the resulting program efficiently using sum-of-squares optimization and a Gaussian process envelope. Our approach lifts the need for restrictive assumptions on the system dynamics and uncertainty, and suggests an improvement in the sample complexity of verifying the safety of a system on a tested case study compared to a state-of-the-art approach.
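To make the embedding step concrete, the following is a minimal sketch of a standard regularized empirical CME estimator, not the authors' implementation. It approximates the conditional expectation of a candidate barrier function at the next state as a weighted sum over observed transitions. The Gaussian kernel, the regularization constant, the toy linear dynamics, and the quadratic `barrier` function are all assumptions chosen for illustration; the ambiguity set, the sum-of-squares program, and the Gaussian process envelope are not covered.

```python
import numpy as np

def gaussian_kernel(A, B, lengthscale=0.5):
    """Gram matrix with entries exp(-||a - b||^2 / (2 * lengthscale^2))."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

def cme_weights(X, x_query, reg=1e-3, lengthscale=0.5):
    """Empirical CME weights w(x) = (K + N * reg * I)^{-1} k_X(x).

    With these weights, E[f(X+) | X = x] is approximated by
    sum_i w_i(x) * f(X+_i) for functions f in the RKHS.
    """
    N = X.shape[0]
    K = gaussian_kernel(X, X, lengthscale)           # (N, N)
    k_x = gaussian_kernel(X, x_query, lengthscale)   # (N, M)
    return np.linalg.solve(K + N * reg * np.eye(N), k_x)

def barrier(x):
    """Hypothetical candidate barrier B(x) = ||x||^2 (illustration only)."""
    return (x ** 2).sum(axis=-1)

# Toy transition data (X_i, X+_i); the dynamics below are an assumption.
rng = np.random.default_rng(0)
N = 200
X = rng.uniform(-1.0, 1.0, size=(N, 1))
X_next = 0.8 * X + 0.05 * rng.standard_normal((N, 1))

# Estimate E[B(X+) | X = 0.5] from data alone.
x_query = np.array([[0.5]])
W = cme_weights(X, x_query)              # shape (N, 1)
expected_B_next = barrier(X_next) @ W    # shape (1,)
print(expected_B_next)
```

For this toy system the exact value is (0.8 * 0.5)^2 + 0.05^2 = 0.1625, which gives a reference point for judging the estimate. The regularization constant and lengthscale trade bias against variance and would need tuning on real data.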
