Table of Contents
Fetching ...

Correct-by-Design Control Synthesis of Stochastic Multi-agent Systems: a Robust Tensor-based Solution

Ruohan Wang, Siyuan Liu, Zhiyong Sun, Sofie Haesaert

TL;DR

The paper addresses correct-by-design control synthesis for continuous-state stochastic multi-agent systems under temporal logic specifications. It combines finite abstractions with $(\epsilon,\delta)$-stochastic simulation relations and a tensor-tree, CPD-based dynamic programming approach to compute lower bounds on satisfaction probabilities efficiently. The authors introduce a-posteriori correction and a robust-tree DP to translate abstract results into certified guarantees for the concrete system, and demonstrate scalability on linear multi-agent benchmarks, including up to 20 agents. This framework provides provable safety guarantees with scalable computation, enabling practical deployment in safety-critical, high-dimensional settings.

Abstract

Discrete-time stochastic systems with continuous spaces are hard to verify and control, even with MDP abstractions due to the curse of dimensionality. We propose an abstraction-based framework with robust dynamic programming mappings that deliver control strategies with provable lower bounds on temporal-logic satisfaction, quantified via approximate stochastic simulation relations. Exploiting decoupled dynamics, we reveal a Canonical Polyadic Decomposition tensor structure in value functions that makes dynamic programming scalable. The proposed method provides correct-by-design probabilistic guarantees for temporal logic specifications. We validate our results on continuous-state linear stochastic systems.

Correct-by-Design Control Synthesis of Stochastic Multi-agent Systems: a Robust Tensor-based Solution

TL;DR

The paper addresses correct-by-design control synthesis for continuous-state stochastic multi-agent systems under temporal logic specifications. It combines finite abstractions with -stochastic simulation relations and a tensor-tree, CPD-based dynamic programming approach to compute lower bounds on satisfaction probabilities efficiently. The authors introduce a-posteriori correction and a robust-tree DP to translate abstract results into certified guarantees for the concrete system, and demonstrate scalability on linear multi-agent benchmarks, including up to 20 agents. This framework provides provable safety guarantees with scalable computation, enabling practical deployment in safety-critical, high-dimensional settings.

Abstract

Discrete-time stochastic systems with continuous spaces are hard to verify and control, even with MDP abstractions due to the curse of dimensionality. We propose an abstraction-based framework with robust dynamic programming mappings that deliver control strategies with provable lower bounds on temporal-logic satisfaction, quantified via approximate stochastic simulation relations. Exploiting decoupled dynamics, we reveal a Canonical Polyadic Decomposition tensor structure in value functions that makes dynamic programming scalable. The proposed method provides correct-by-design probabilistic guarantees for temporal logic specifications. We validate our results on continuous-state linear stochastic systems.

Paper Structure

This paper contains 17 sections, 7 theorems, 70 equations, 5 figures, 3 algorithms.

Key Result

Proposition 1

For a given policy $\boldsymbol{\pi}=\{\pi[t]|t=0,1,\ldots,T-1\}$, the satisfaction probability within time horizon $T$ for specification $\phi$ of the multi-agent system $\mathbf{M}$ is defined as with $\bar{q}_0= \tau_{\mathcal{A}}(q_0,L(h(x_0)))$, and computed as in eq:vi_coupled_similar_model.

Figures (5)

  • Figure 1: Control-synthesis (three-agent example). Each agent $\mathbf{M}^i$\ref{['eq:decoupled_hd_mdp']} is abstracted as a gMDP $\hat{\mathbf{M}}^i$ (red-shaded block; Sec. \ref{['sec:simrel']}). The per-agent value components are then merged in every tree node and propagated through the tensor tree (blue-shaded block; Sec. \ref{['sec:tensor_tree_ap']} and Sec. \ref{['sec:new_robust_tensor_tree']}), yielding a high-dimensional satisfaction probability.
  • Figure 2: Left: Letter- or Boolean-formula-triggered transitions of DFA ($q_s$ denotes $q_{\text{sink}}$ for simplicity), right: output space labeled by $l$ or $\alpha$.
  • Figure 3: Lower-bounded satisfaction probabilities for initial states $x_0\in [-20,5]\times [-20,5]$ with specification horizon $T=6$.
  • Figure 4: Lower bounds on the satisfaction probability of a reach-avoid specification on system \ref{['eq:ex1']} as a function of the specification horizon. Orange: a-posteriori corrected (Theorem \ref{['theorem:branch_length_lossProb']}, $\Pi_{\mathbf{M}}^\ast$ optimized as in \ref{['eq:pol_max_apos']}); blue: robust-tree corrected (Theorem \ref{['theorem:satprob_lb_robusttree']}, $\Pi_{\mathbf{M}}^\ast$ optimized as in \ref{['eq:maxpol_rt']}).
  • Figure 5: Lower bounds on the satisfaction probability of an invariance specification on system \ref{['eq:system_mdp_safety']} as a function of the specification horizon $T$. Orange, red: a-posteriori (APoS) corrected (Theorem \ref{['theorem:branch_length_lossProb']}, $\Pi_{\mathbf{M}}^\ast$ optimized as in \ref{['eq:pol_max_apos']}) for $4$-agent system and $20$-agent system; blue, green: robust-tree (RT) corrected (Theorem \ref{['theorem:satprob_lb_robusttree']}, $\Pi_{\mathbf{M}}^\ast$ optimized as in \ref{['eq:maxpol_rt']}) for $4$-agent system and $20$-agent system.

Theorems & Definitions (18)

  • Definition 1: Markov policy $\boldsymbol{\pi}$
  • Definition 2: DFA
  • Definition 3: general Markov decision processes (gMDP)
  • Proposition 1: Specification satisfaction.
  • Definition 4: $(\epsilon,\delta)$-stochastic simulation relation
  • Definition 5: Tensor-tree value function $\mathcal{G}$ wang2025unraveling
  • Proposition 2: Tensor-tree value iterations wang2025unraveling
  • Proposition 3: $\mathbf{C}_{\boldsymbol{\pi}}$ with satisfaction probability
  • Theorem 1: $\mathbf{C}_{\boldsymbol{\pi}}$ with a-posteriori corrected satisfaction probability
  • Theorem 2: $\mathbf{C}_{\boldsymbol{\pi}}$ with robust-tree corrected satisfaction probability
  • ...and 8 more