Positivity-hardness results on Markov decision processes
Jakob Piribauer, Christel Baier
TL;DR
This work studies optimization problems on finite-state Markov decision processes that combine non-determinism with probabilistic transitions, focusing on one-counter and integer-weighted variants. The authors introduce a modular gadget framework to encode linear recurrence sequences within the optimal values of various objectives, then reduce the Positivity problem to threshold questions for these objectives (e.g., termination probabilities, energy-satisfaction, SSPPs, and CVaR). The main contribution is a suite of Positivity-hardness results, showing that solving these threshold problems would imply major breakthroughs in analytic number theory; in particular, decidability of Positivity would yield far-reaching consequences. The reductions rely on carefully constructed MDP gadgets that encode both the recurrence and its initial values, forcing schedulers to make decisions that reflect the sign of the recurrence terms. Together, these results delineate a deep connection between fundamental number-theoretic problems and the algorithmic tractability of a broad class of optimization problems in MDPs, suggesting that many of these problems are likely undecidable or require extraordinary advances to resolve.
Abstract
This paper investigates a series of optimization problems for one-counter Markov decision processes (MDPs) and integer-weighted MDPs with finite state space. Specifically, it considers problems addressing termination probabilities and expected termination times for one-counter MDPs, as well as satisfaction probabilities of energy objectives, conditional and partial expectations, satisfaction probabilities of constraints on the total accumulated weight, the computation of quantiles for the accumulated weight, and the conditional value-at-risk for accumulated weights for integer-weighted MDPs. Although algorithmic results are available for some special instances, the decidability status of the decision versions of these problems is unknown in general. The paper demonstrates that these optimization problems are inherently mathematically difficult by providing polynomial-time reductions from the Positivity problem for linear recurrence sequences. This problem is a well-known number-theoretic problem whose decidability status has been open for decades and it is known that decidability of the Positivity problem would have far-reaching consequences in analytic number theory. So, the reductions presented in the paper show that an algorithmic solution to any of the investigated problems is not possible without a major breakthrough in analytic number theory. The reductions rely on the construction of MDP-gadgets that encode the initial values and linear recurrence relations of linear recurrence sequences. These gadgets can flexibly be adjusted to prove the various Positivity-hardness results.
