MULTIGAIN 2.0: MDP controller synthesis for multiple mean-payoff, LTL and steady-state constraints

Severin Bals; Alexandros Evangelidis; Jan Křetínský; Jakob Waibel

MULTIGAIN 2.0: MDP controller synthesis for multiple mean-payoff, LTL and steady-state constraints

Severin Bals, Alexandros Evangelidis, Jan Křetínský, Jakob Waibel

TL;DR

MULTIGAIN 2.0 advances $LRA$-oriented controller synthesis for MDPs by integrating $LTL$ and $SS$ constraints within a PRISM-based pipeline. It introduces four query modes (multi, mlessmulti, detmulti, unichain) that cover Boolean, finite-memory, and deterministic strategies, while enabling Pareto frontier visualization across two or three dimensions. The workflow translates $LTL$ to automata, builds a product MDP, and solves a linear program to obtain optimal policies, with optional memory bounds and $\\delta$-relaxation for feasibility. Experimental results on grid-world models demonstrate scalability and reveal solver- and size-dependent performance, underscoring practical utility and guiding future extensions to omega-regular objectives.

Abstract

We present MULTIGAIN 2.0, a major extension to the controller synthesis tool MULTIGAIN, built on top of the probabilistic model checker PRISM. This new version extends MULTIGAIN's multi-objective capabilities, by allowing for the formal verification and synthesis of controllers for probabilistic systems with multi-dimensional long-run average reward structures, steady-state constraints, and linear temporal logic properties. Additionally, MULTIGAIN 2.0 can modify the underlying linear program to prevent unbounded-memory and other unintuitive solutions and visualizes Pareto curves, in the two- and three-dimensional cases, to facilitate trade-off analysis in multi-objective scenarios.

MULTIGAIN 2.0: MDP controller synthesis for multiple mean-payoff, LTL and steady-state constraints

TL;DR

MULTIGAIN 2.0 advances

-oriented controller synthesis for MDPs by integrating

and

constraints within a PRISM-based pipeline. It introduces four query modes (multi, mlessmulti, detmulti, unichain) that cover Boolean, finite-memory, and deterministic strategies, while enabling Pareto frontier visualization across two or three dimensions. The workflow translates

to automata, builds a product MDP, and solves a linear program to obtain optimal policies, with optional memory bounds and

-relaxation for feasibility. Experimental results on grid-world models demonstrate scalability and reveal solver- and size-dependent performance, underscoring practical utility and guiding future extensions to omega-regular objectives.

Abstract

Paper Structure (15 sections, 1 equation, 7 figures, 2 tables)

This paper contains 15 sections, 1 equation, 7 figures, 2 tables.

Introduction
Functionality
Workflow
Example
Infinite-horizon properties
Syntax and semantics
multi
mlessmulti
detmulti
unichain
Interface
Implementation characteristics
Experimental Evaluation
Results
Conclusion

Figures (7)

Figure 1: An MDP and its heterogeneous specification
Figure 2: The workflow of MultiGain 2.0 from input specifications to output policy.
Figure 3: Example application of MultiGain 2.0 to an adjusted grid world model (a) and the corresponding solution (b).
Figure 4: An example MDP with LTL and steady-state constraints. Only policies visiting the accepting state less and less frequently satisfy both specifications, requiring unbounded memory for the information on a current run's history.
Figure 5: Example plots of approximated Pareto curves with 2 and 3 dimensions.
...and 2 more figures

MULTIGAIN 2.0: MDP controller synthesis for multiple mean-payoff, LTL and steady-state constraints

TL;DR

Abstract

MULTIGAIN 2.0: MDP controller synthesis for multiple mean-payoff, LTL and steady-state constraints

Authors

TL;DR

Abstract

Table of Contents

Figures (7)