Table of Contents
Fetching ...

Revealing POMDPs: Qualitative and Quantitative Analysis for Parity Objectives

Ali Asadi, Krishnendu Chatterjee, David Lurie, Raimundo Saona

TL;DR

This work analyzes partially observable MDPs (POMDPs) with parity objectives, focusing on revealing POMDPs where observed state information occasionally reveals the true state. It introduces belief-reachability as a unifying objective and shows parity values coincide with belief-reachability to almost-sure winning parity states; it also proves that limit-sure and almost-sure winning coincide in revealing POMDPs. The results establish EXPTIME-complete (and in EXPTIME for quantitative) complexity for both belief-reachability and parity analyses, and provide finite-horizon and point-based methods to approximate values, along with the existence of optimal policies. Overall, the work advances tractability boundaries for parity analysis on revealing POMDPs and offers algorithmic frameworks for practical evaluation.

Abstract

Partially observable Markov decision processes (POMDPs) are a central model for uncertainty in sequential decision making. The most basic objective is the reachability objective, where a target set must be eventually visited, and the more general parity objectives can model all omega-regular specifications. For such objectives, the computational analysis problems are the following: (a) qualitative analysis that asks whether the objective can be satisfied with probability 1 (almost-sure winning) or probability arbitrarily close to 1 (limit-sure winning); and (b) quantitative analysis that asks for the approximation of the optimal probability of satisfying the objective. For general POMDPs, almost-sure analysis for reachability objectives is EXPTIME-complete, but limit-sure and quantitative analyses for reachability objectives are undecidable; almost-sure, limit-sure, and quantitative analyses for parity objectives are all undecidable. A special class of POMDPs, called revealing POMDPs, has been studied recently in several works, and for this subclass the almost-sure analysis for parity objectives was shown to be EXPTIME-complete. In this work, we show that for revealing POMDPs the limit-sure analysis for parity objectives is EXPTIME-complete, and even the quantitative analysis for parity objectives can be achieved in EXPTIME.

Revealing POMDPs: Qualitative and Quantitative Analysis for Parity Objectives

TL;DR

This work analyzes partially observable MDPs (POMDPs) with parity objectives, focusing on revealing POMDPs where observed state information occasionally reveals the true state. It introduces belief-reachability as a unifying objective and shows parity values coincide with belief-reachability to almost-sure winning parity states; it also proves that limit-sure and almost-sure winning coincide in revealing POMDPs. The results establish EXPTIME-complete (and in EXPTIME for quantitative) complexity for both belief-reachability and parity analyses, and provide finite-horizon and point-based methods to approximate values, along with the existence of optimal policies. Overall, the work advances tractability boundaries for parity analysis on revealing POMDPs and offers algorithmic frameworks for practical evaluation.

Abstract

Partially observable Markov decision processes (POMDPs) are a central model for uncertainty in sequential decision making. The most basic objective is the reachability objective, where a target set must be eventually visited, and the more general parity objectives can model all omega-regular specifications. For such objectives, the computational analysis problems are the following: (a) qualitative analysis that asks whether the objective can be satisfied with probability 1 (almost-sure winning) or probability arbitrarily close to 1 (limit-sure winning); and (b) quantitative analysis that asks for the approximation of the optimal probability of satisfying the objective. For general POMDPs, almost-sure analysis for reachability objectives is EXPTIME-complete, but limit-sure and quantitative analyses for reachability objectives are undecidable; almost-sure, limit-sure, and quantitative analyses for parity objectives are all undecidable. A special class of POMDPs, called revealing POMDPs, has been studied recently in several works, and for this subclass the almost-sure analysis for parity objectives was shown to be EXPTIME-complete. In this work, we show that for revealing POMDPs the limit-sure analysis for parity objectives is EXPTIME-complete, and even the quantitative analysis for parity objectives can be achieved in EXPTIME.

Paper Structure

This paper contains 44 sections, 30 theorems, 61 equations, 1 figure, 2 tables, 3 algorithms.

Key Result

Theorem 1

Quantitative analysis for belief-reachability objectives for revealing POMDPs is in EXPTIME.

Figures (1)

  • Figure 1: Example of a classic general POMDP that requires infinite memory policies for parity objectives. Edges represent a positive probability transition between states when the corresponding action in its label is used.

Theorems & Definitions (59)

  • Definition 1: POMDP
  • Definition 2: Revealing POMDP
  • Remark 1: Signals
  • Remark 2: Generality of belief-reachability
  • Theorem 1
  • Corollary 2
  • Theorem 3
  • Theorem 4
  • Corollary 4
  • Definition 3: Reliable Action
  • ...and 49 more