Table of Contents
Fetching ...

Synthesis of Safety Specifications for Probabilistic Systems

Gaspard Ohlmann, Edwin Hamel-De le Court, Francesco Belardinelli

TL;DR

The paper tackles synthesizing policies for safety specifications in probabilistic systems using Continuing PCTL (CPCTL), a fragment of safe PCTL that supports nested probabilistic operators. It introduces an augmented MDP that encodes global CPCTL satisfaction as local linear constraints and proves a coherence (state/path compatibility) theorem to connect local constraints to global satisfaction. It then presents CPCTL-VI, a value-iteration algorithm that computes realizable augmented states and corresponding policies, with soundness and optimality under a generalized Slater assumption. The authors demonstrate the approach on gridworld-like models, producing Pareto-frontier policies that balance safety and performance, and show CPCTL can express richer safety requirements than traditional MOR/MOA fragments with practical synthesis guarantees.

Abstract

Ensuring that agents satisfy safety specifications can be crucial in safety-critical environments. While methods exist for controller synthesis with safe temporal specifications, most existing methods restrict safe temporal specifications to probabilistic-avoidance constraints. Formal methods typically offer more expressive ways to express safety in probabilistic systems, such as Probabilistic Computation Tree Logic (PCTL) formulas. Thus, in this paper, we develop a new approach that supports more general temporal properties expressed in PCTL. Our contribution is twofold. First, we develop a theoretical framework for the Synthesis of safe-PCTL specifications. We show how the reducing global specification satisfaction to local constraints, and define CPCTL, a fragment of safe-PCTL. We demonstrate how the expressiveness of CPCTL makes it a relevant fragment for the Synthesis Problem. Second, we leverage these results and propose a new Value Iteration-based algorithm to solve the synthesis problem for these more general temporal properties, and we prove the soundness and completeness of our method.

Synthesis of Safety Specifications for Probabilistic Systems

TL;DR

The paper tackles synthesizing policies for safety specifications in probabilistic systems using Continuing PCTL (CPCTL), a fragment of safe PCTL that supports nested probabilistic operators. It introduces an augmented MDP that encodes global CPCTL satisfaction as local linear constraints and proves a coherence (state/path compatibility) theorem to connect local constraints to global satisfaction. It then presents CPCTL-VI, a value-iteration algorithm that computes realizable augmented states and corresponding policies, with soundness and optimality under a generalized Slater assumption. The authors demonstrate the approach on gridworld-like models, producing Pareto-frontier policies that balance safety and performance, and show CPCTL can express richer safety requirements than traditional MOR/MOA fragments with practical synthesis guarantees.

Abstract

Ensuring that agents satisfy safety specifications can be crucial in safety-critical environments. While methods exist for controller synthesis with safe temporal specifications, most existing methods restrict safe temporal specifications to probabilistic-avoidance constraints. Formal methods typically offer more expressive ways to express safety in probabilistic systems, such as Probabilistic Computation Tree Logic (PCTL) formulas. Thus, in this paper, we develop a new approach that supports more general temporal properties expressed in PCTL. Our contribution is twofold. First, we develop a theoretical framework for the Synthesis of safe-PCTL specifications. We show how the reducing global specification satisfaction to local constraints, and define CPCTL, a fragment of safe-PCTL. We demonstrate how the expressiveness of CPCTL makes it a relevant fragment for the Synthesis Problem. Second, we leverage these results and propose a new Value Iteration-based algorithm to solve the synthesis problem for these more general temporal properties, and we prove the soundness and completeness of our method.

Paper Structure

This paper contains 44 sections, 22 theorems, 39 equations, 5 figures, 2 tables, 1 algorithm.

Key Result

Lemma 1

For every MDP $\mathcal{M}$, path formula $\phi=\Phi_1\mathbf{W}(\Phi_1\land\Phi_2)$, and policy $\pi$, we have $\mathcal{M}_{\pi}\models \mathbb P_{\geq 1}[\phi]$ iff $\mathcal{M}_{\pi}\models \mathbb P_{\geq 1}[\mathtt{L}\left(\Phi_1\right)\mathbf{W}(\Phi_1\land\Phi_2)]$.

Figures (5)

  • Figure 1: Expressivity of CPCTL$\,$ in the PCTL hierarchy
  • Figure 2: The Markov Decision Process $\mathcal{M}_1$
  • Figure 4: First Grid World: $7\times7$ grid with a central wall and slip gradient represented as gray shading (darker means higher slip probability). The slip is a uniform stochastic noise in all the available directions. The goal (G) is safe and the start (S) is the initial state.
  • Figure 5: Results of the Algorithm CPCTL-VI for the two models. The curves are step functions because the convexity of the Pareto Frontier is not guaranteed. The points are not joined to ensure soundness and guarantee safety. The curve obtained in the second model is an example of non-convex Pareto Curve.
  • Figure 6: Second Grid World: $9\times 10$. The agent starts at the initial state $S$ and must reach the safe goal $\mathbf{G}$ while avoiding the unsafe borders. The central wall is impassable but possesses a hole in the middle that behaves like any right-side state. Darker gray indicates higher slip probability. The slip is a stochastic noise uniform in all the available directions.

Theorems & Definitions (41)

  • Definition 1: Semantics
  • Definition 2
  • Definition 3: Literal projection
  • Lemma 1
  • Corollary 1
  • Example 1
  • Theorem 1
  • Definition 4: Augmented MDP
  • Definition 5
  • Definition 6
  • ...and 31 more