A Parametric Model for Near-Optimal Online Synthesis with Robust Reach-Avoid Guarantees
Mario Gleirscher, Philip Hönnecke
TL;DR
The paper tackles the problem of providing explainable, robust guarantees for online synthesis of near-optimal controllers in high-integrity CPS by replacing RL with exhaustive search framed as weighted hybrid game automata. It introduces parametric modal games, a step-shielded forward-Euler discrete dynamic programming approach, and scope-adaptive fixpoint approximations to deliver bounded-correctness reach-avoid guarantees online for an autonomous aerial vehicle delivery task. A three-tier control separation (supervisory, tactical, and local policy learning) and a Hyper-Policy pre-computation mechanism enable online execution with guaranteed safety and liveness under bounded disturbances. While current implementation is computationally intensive and not real-time, the framework demonstrates substantial robustness and extensibility across diverse obstacle settings, offering a principled alternative to RL in scenarios demanding formal guarantees. The approach has practical impact for safety-critical autonomous systems and provides a foundation for future multi-agent extensions and performance optimisations.
Abstract
Objective: To obtain explainable guarantees in the online synthesis of optimal controllers for high-integrity cyber-physical systems, we re-investigate the use of exhaustive search as an alternative to reinforcement learning. Approach: We model an application scenario as a hybrid game automaton, enabling the synthesis of robustly correct and near-optimal controllers online without prior training. For modal synthesis, we employ discretised games solved via scope-adaptive and step-pre-shielded discrete dynamic programming. Evaluation: In a simulation-based experiment, we apply our approach to an autonomous aerial vehicle scenario. Contribution: We propose a parametric system model and a parametric online synthesis.
