Good-for-MDP State Reduction for Stochastic LTL Planning

Christoph Weinhuber; Giuseppe De Giacomo; Yong Li; Sven Schewe; Qiyi Tang

Good-for-MDP State Reduction for Stochastic LTL Planning

Christoph Weinhuber, Giuseppe De Giacomo, Yong Li, Sven Schewe, Qiyi Tang

TL;DR

The paper addresses scalable stochastic planning for MDPs with LTL goals by leveraging good-for-MDP (GFM) automata and a novel state-space reduction pipeline. It shows how to convert GFM automata into 0/1-PA via a sequence of polynomial-time transformations, enabling efficient policy synthesis while preserving optimality. A key contribution is a direct, specialized construction for the $G F \varphi$ fragment (with $\varphi$ co-safety) that yields only $2^{O(|\varphi|)}$ states, a singly exponential blow-up compared to the doubly exponential growth of general approaches. Empirical results demonstrate substantial automata-size reductions and runtime improvements across diverse benchmarks, highlighting the practical impact for planning, verification, and reinforcement learning tasks involving LTL objectives.

Abstract

We study stochastic planning problems in Markov Decision Processes (MDPs) with goals specified in Linear Temporal Logic (LTL). The state-of-the-art approach transforms LTL formulas into good-for-MDP (GFM) automata, which feature a restricted form of nondeterminism. These automata are then composed with the MDP, allowing the agent to resolve the nondeterminism during policy synthesis. A major factor affecting the scalability of this approach is the size of the generated automata. In this paper, we propose a novel GFM state-space reduction technique that significantly reduces the number of automata states. Our method employs a sophisticated chain of transformations, leveraging recent advances in good-for-games minimisation developed for adversarial settings. In addition to our theoretical contributions, we present empirical results demonstrating the practical effectiveness of our state-reduction technique. Furthermore, we introduce a direct construction method for formulas of the form $\mathsf{G}\mathsf{F}\varphi$, where $\varphi$ is a co-safety formula. This construction is provably single-exponential in the worst case, in contrast to the general doubly-exponential complexity. Our experiments confirm the scalability advantages of this specialised construction.

Good-for-MDP State Reduction for Stochastic LTL Planning

TL;DR

Abstract

Good-for-MDP State Reduction for Stochastic LTL Planning

TL;DR

Abstract

Paper Structure

Table of Contents

Key Result

Figures (3)

Theorems & Definitions (12)