Solving MDPs with LTLf+ and PPLTL+ Temporal Objectives

Giuseppe De Giacomo; Yong Li; Sven Schewe; Christoph Weinhuber; Pian Yu

TL;DR

This work addresses probabilistic planning for infinite-trace objectives by solving MDPs whose objectives are expressed in the recently proposed logics LTLf+ and PPLTL+. The authors introduce a DFA-based, compositional workflow: build good-for-MDPs (GFM) Büchi automata from DFA components, take the product with the MDP, and reduce synthesis to reaching accepting end-components, which yields a sound and complete procedure for computing optimal strategies. They give constructions for the four leaf classes $\exists \phi$, $\forall \phi$, $\forall\exists \phi$, and $\exists\forall \phi$ that produce GFM Büchi automata, and prove closure under union and intersection to support Boolean combinations. For PPLTL+, the DFA-based constructions apply directly to symbolic DFAs, which can potentially be built in polynomial time, and the approach emphasizes implementation-friendly, scalable synthesis suited to symbolic tools such as PRISM. Overall, the paper offers practical methods for MDP synthesis under rich infinite-trace specifications, balancing expressiveness against computational tractability.
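To make the four leaf classes concrete, the sketch below evaluates them on an ultimately periodic ("lasso") trace by running a DFA for the finite-trace formula $\phi$ over every non-empty prefix. It assumes the usual reading of the quantifiers: $\exists \phi$ holds if some prefix satisfies $\phi$, $\forall \phi$ if every prefix does, $\forall\exists \phi$ if infinitely many do, and $\exists\forall \phi$ if all but finitely many do. This is only an illustration of the semantics on a single trace, not the paper's GFM Büchi construction; the function names, the dictionary encoding of the DFA, and the example DFA ("the last letter is a") are all hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple

State = Hashable
Letter = Hashable
Delta = Dict[Tuple[State, Letter], State]


def prefix_states(delta: Delta, init: State,
                  stem: Sequence[Letter], loop: Sequence[Letter]
                  ) -> Tuple[List[State], List[State]]:
    """Split the DFA states reached after the non-empty prefixes of
    stem . loop^omega into (visited finitely often, visited forever)."""
    if not loop:
        raise ValueError("loop must be non-empty")
    state = init
    transient: List[State] = []
    for a in stem:
        state = delta[(state, a)]
        transient.append(state)
    # Unroll the loop until the state at the start of a round repeats.
    seen_at_round_start: Dict[State, int] = {}
    rounds: List[List[State]] = []
    while state not in seen_at_round_start:
        seen_at_round_start[state] = len(rounds)
        this_round = []
        for a in loop:
            state = delta[(state, a)]
            this_round.append(state)
        rounds.append(this_round)
    first = seen_at_round_start[state]  # rounds[first:] repeat forever
    for r in rounds[:first]:
        transient.extend(r)
    recurrent = [q for r in rounds[first:] for q in r]
    return transient, recurrent


def holds(kind: str, delta: Delta, init: State, accepting: Set[State],
          stem: Sequence[Letter], loop: Sequence[Letter]) -> bool:
    """Evaluate a leaf formula; kind is 'E', 'A', 'AE' or 'EA'."""
    transient, recurrent = prefix_states(delta, init, stem, loop)
    if kind == "E":    # some prefix is accepted
        return any(q in accepting for q in transient + recurrent)
    if kind == "A":    # every prefix is accepted
        return all(q in accepting for q in transient + recurrent)
    if kind == "AE":   # infinitely many prefixes are accepted
        return any(q in accepting for q in recurrent)
    if kind == "EA":   # all but finitely many prefixes are accepted
        return all(q in accepting for q in recurrent)
    raise ValueError(f"unknown leaf kind: {kind}")


# Hypothetical DFA over {a, b}: accepts finite traces whose last letter is 'a'.
DELTA: Delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}
ACCEPTING = {1}

if __name__ == "__main__":
    for kind in ("E", "A", "AE", "EA"):
        print(kind, holds(kind, DELTA, 0, ACCEPTING, "b", "ab"))
```

On the trace b(ab)^ω the prefixes alternate between ending in a and ending in b, so only the $\exists$ and $\forall\exists$ readings hold. Because the DFA is deterministic, the four classes differ only in which of the visited states must be accepting; this is the kind of structure that DFA-based constructions for the infinite-trace setting can exploit.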

Abstract

The temporal logics LTLf+ and PPLTL+ have recently been proposed to express objectives over infinite traces. These logics are appealing because they match the expressive power of LTL on infinite traces while enabling efficient DFA-based techniques, which have been crucial to the scalability of reactive synthesis and adversarial planning in LTLf and PPLTL over finite traces. In this paper, we demonstrate that these logics are also highly effective in the context of MDPs. Introducing a technique tailored for probabilistic systems, we leverage the benefits of efficient DFA-based methods and compositionality. This approach is simpler than its non-probabilistic counterparts in reactive synthesis and adversarial planning, as it accommodates a controlled form of nondeterminism ("good for MDPs") in the automata when transitioning from finite to infinite traces. Notably, by exploiting compositionality, our solution is both implementation-friendly and well-suited for straightforward symbolic implementations.
