Table of Contents
Fetching ...

A Parametric Model for Near-Optimal Online Synthesis with Robust Reach-Avoid Guarantees

Mario Gleirscher, Philip Hönnecke

TL;DR

The paper tackles the problem of providing explainable, robust guarantees for online synthesis of near-optimal controllers in high-integrity CPS by replacing RL with exhaustive search framed as weighted hybrid game automata. It introduces parametric modal games, a step-shielded forward-Euler discrete dynamic programming approach, and scope-adaptive fixpoint approximations to deliver bounded-correctness reach-avoid guarantees online for an autonomous aerial vehicle delivery task. A three-tier control separation (supervisory, tactical, and local policy learning) and a Hyper-Policy pre-computation mechanism enable online execution with guaranteed safety and liveness under bounded disturbances. While current implementation is computationally intensive and not real-time, the framework demonstrates substantial robustness and extensibility across diverse obstacle settings, offering a principled alternative to RL in scenarios demanding formal guarantees. The approach has practical impact for safety-critical autonomous systems and provides a foundation for future multi-agent extensions and performance optimisations.

Abstract

Objective: To obtain explainable guarantees in the online synthesis of optimal controllers for high-integrity cyber-physical systems, we re-investigate the use of exhaustive search as an alternative to reinforcement learning. Approach: We model an application scenario as a hybrid game automaton, enabling the synthesis of robustly correct and near-optimal controllers online without prior training. For modal synthesis, we employ discretised games solved via scope-adaptive and step-pre-shielded discrete dynamic programming. Evaluation: In a simulation-based experiment, we apply our approach to an autonomous aerial vehicle scenario. Contribution: We propose a parametric system model and a parametric online synthesis.

A Parametric Model for Near-Optimal Online Synthesis with Robust Reach-Avoid Guarantees

TL;DR

The paper tackles the problem of providing explainable, robust guarantees for online synthesis of near-optimal controllers in high-integrity CPS by replacing RL with exhaustive search framed as weighted hybrid game automata. It introduces parametric modal games, a step-shielded forward-Euler discrete dynamic programming approach, and scope-adaptive fixpoint approximations to deliver bounded-correctness reach-avoid guarantees online for an autonomous aerial vehicle delivery task. A three-tier control separation (supervisory, tactical, and local policy learning) and a Hyper-Policy pre-computation mechanism enable online execution with guaranteed safety and liveness under bounded disturbances. While current implementation is computationally intensive and not real-time, the framework demonstrates substantial robustness and extensibility across diverse obstacle settings, offering a principled alternative to RL in scenarios demanding formal guarantees. The approach has practical impact for safety-critical autonomous systems and provides a foundation for future multi-agent extensions and performance optimisations.

Abstract

Objective: To obtain explainable guarantees in the online synthesis of optimal controllers for high-integrity cyber-physical systems, we re-investigate the use of exhaustive search as an alternative to reinforcement learning. Approach: We model an application scenario as a hybrid game automaton, enabling the synthesis of robustly correct and near-optimal controllers online without prior training. For modal synthesis, we employ discretised games solved via scope-adaptive and step-pre-shielded discrete dynamic programming. Evaluation: In a simulation-based experiment, we apply our approach to an autonomous aerial vehicle scenario. Contribution: We propose a parametric system model and a parametric online synthesis.

Paper Structure

This paper contains 33 sections, 9 equations, 9 figures, 1 table, 3 algorithms.

Figures (9)

  • Figure 1: Scene model for an AAV following a route ${\bar{r}}$ from waypoint ${\mathbf{p}}_0$ to ${\mathbf{p}}_7$
  • Figure 2: Assured online synthesis with (a) remote or (b) on-board computation
  • Figure 3: Overall AAV control block structure
  • Figure 4: Tactical AAV control ($\mathit{f}$/$\mathit{Inv}$ are above/below the dashed lines in the modes, and $\mathit{Jmp}$ is shown in the transition labels) and supervisor logic
  • Figure 5: Play of $G$ with ${\bar{r}}=\{{\mathbf{p}}_i\}_{i=0}^n$ for the industrial scenario ($n=13$)
  • ...and 4 more figures

Theorems & Definitions (1)

  • Definition 1: Online Synthesis Problem