Table of Contents
Fetching ...

Solving MDPs with LTLf+ and PPLTL+ Temporal Objectives

Giuseppe De Giacomo, Yong Li, Sven Schewe, Christoph Weinhuber, Pian Yu

TL;DR

This work addresses probabilistic planning for infinite-trace objectives by extending LTLf+ and PPLTL+ and solving MDPs with these logics. The authors introduce a DFA-based, compositional workflow that constructs good-for-MDPs (GFM) Büchi automata from DFA components and uses product constructions with the MDP to reduce synthesis to reaching accepting end-components, providing sound, complete, and optimal strategies. They develop leaf-formula constructions for the leaf classes $\exists \phi$, $\forall \phi$, $\forall\exists \phi$, and $\exists\forall \phi$ to obtain GFM Büchi automata, and prove closure under union and intersection to support Boolean combinations. For PPLTL+, symbolic, DFA-based constructions are directly applicable, yielding potentially polynomial-time symbolic DFAs, and the approach emphasizes implementation-friendly, scalable synthesis suitable for symbolic tools like PRISM. Overall, the paper advances practical, scalable methods for MDP synthesis under rich infinite-trace specifications, balancing expressiveness with computational tractability and symbolic implementability.

Abstract

The temporal logics LTLf+ and PPLTL+ have recently been proposed to express objectives over infinite traces. These logics are appealing because they match the expressive power of LTL on infinite traces while enabling efficient DFA-based techniques, which have been crucial to the scalability of reactive synthesis and adversarial planning in LTLf and PPLTL over finite traces. In this paper, we demonstrate that these logics are also highly effective in the context of MDPs. Introducing a technique tailored for probabilistic systems, we leverage the benefits of efficient DFA-based methods and compositionality. This approach is simpler than its non-probabilistic counterparts in reactive synthesis and adversarial planning, as it accommodates a controlled form of nondeterminism (``good for MDPs") in the automata when transitioning from finite to infinite traces. Notably, by exploiting compositionality, our solution is both implementation-friendly and well-suited for straightforward symbolic implementations.

Solving MDPs with LTLf+ and PPLTL+ Temporal Objectives

TL;DR

This work addresses probabilistic planning for infinite-trace objectives by extending LTLf+ and PPLTL+ and solving MDPs with these logics. The authors introduce a DFA-based, compositional workflow that constructs good-for-MDPs (GFM) Büchi automata from DFA components and uses product constructions with the MDP to reduce synthesis to reaching accepting end-components, providing sound, complete, and optimal strategies. They develop leaf-formula constructions for the leaf classes , , , and to obtain GFM Büchi automata, and prove closure under union and intersection to support Boolean combinations. For PPLTL+, symbolic, DFA-based constructions are directly applicable, yielding potentially polynomial-time symbolic DFAs, and the approach emphasizes implementation-friendly, scalable synthesis suitable for symbolic tools like PRISM. Overall, the paper advances practical, scalable methods for MDP synthesis under rich infinite-trace specifications, balancing expressiveness with computational tractability and symbolic implementability.

Abstract

The temporal logics LTLf+ and PPLTL+ have recently been proposed to express objectives over infinite traces. These logics are appealing because they match the expressive power of LTL on infinite traces while enabling efficient DFA-based techniques, which have been crucial to the scalability of reactive synthesis and adversarial planning in LTLf and PPLTL over finite traces. In this paper, we demonstrate that these logics are also highly effective in the context of MDPs. Introducing a technique tailored for probabilistic systems, we leverage the benefits of efficient DFA-based methods and compositionality. This approach is simpler than its non-probabilistic counterparts in reactive synthesis and adversarial planning, as it accommodates a controlled form of nondeterminism (``good for MDPs") in the automata when transitioning from finite to infinite traces. Notably, by exploiting compositionality, our solution is both implementation-friendly and well-suited for straightforward symbolic implementations.

Paper Structure

This paper contains 16 sections, 15 theorems, 9 equations, 5 figures.

Key Result

Theorem 1

Once an end-component $E$ of an MDP is entered, there is a strategy that visits every state-action combination in $E$ with probability $1$ and stays in $E$ forever. Moreover, for every strategy the union of the end-components is visited with probability $1$. An infinite path of an MC $\mathcal{M}$ a

Figures (5)

  • Figure 1: The example DFA $\mathcal{D}$ for LTLf formula $\phi:=\textsf{F} (\textsf{last} \land \text{good})$ that accepts a finite trace in which the proposition $\text{good}$ holds at the last position, and the DFA $\mathcal{C}$ for the language $[\neg\phi]$. Here $\textsf{last}$ indicates the last position of a finite trace, i.e., $\textsf{last}:= \neg (\textsf{X} \textsf{true})$ and $\Sigma:= 2^{\textsf{AP}} = \{a:= \neg \text{good}, b:= \text{good}\}$.
  • Figure 2: The corresponding Büchi automata constructed by our algorithm for $\exists\phi, \forall \phi$ and $\forall \exists \phi$ where $\phi$ is as defined in Figure \ref{['fig:dfa-example']}.
  • Figure 3: The DFA $\mathcal{D}$ for $\phi$ is on upper left, the DCA $\mathcal{A}$ (or DFA $\mathcal{C}$) with the set of rejecting states $\{q_0\}$ is on lower left, $\mathcal{A}$ has language $\{a,b\}^* \cdot b^{\omega}$ and the LDBA $\mathcal{B}'$ is the part within the dashed box and $\mathcal{B}$ is the complete LDBA where $\{\langle q_1, 1\rangle\}$ is the sole accepting state marked with double rounded boxes and the jump transitions in $\delta_j$ are drawn in red colour.
  • Figure 4: The GFM NBAs $\mathcal{A}_0$ and $\mathcal{A}_1$ are depicted on the left and their union product $\mathcal{A}$ is depicted on the right. The accepting states are marked with double rounded boxes.
  • Figure :

Theorems & Definitions (28)

  • Theorem 1: DBLP:phd/us/Alfaro97DBLP:books/daglib/0020348
  • Theorem 2: DBLP:conf/tacas/HahnPSS0W20
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • Theorem 8
  • proof
  • Example 1
  • ...and 18 more