Uniform Value and Decidability in Ergodic Blind Stochastic Games

Krishnendu Chatterjee; David Lurie; Raimundo Saona; Bruno Ziliotto

TL;DR

This work investigates the existence and computability of the uniform value in blind stochastic games, where players never observe the state, focusing on an ergodic subclass. The authors develop a matrix-ergodicity framework showing that forward products of transition matrices become indistinguishable across initial states, which enables a finite-state abstract game that closely tracks the belief dynamics. They prove that every ergodic blind stochastic game has a uniform value and that approximating this value is decidable (with a 2-EXPSPACE upper bound), whereas computing it exactly is undecidable in general. The uniform value is also independent of the initial belief. Together, the results mark a sharp boundary between decidable approximation and undecidable exact computation, with implications for POMDP-like models and belief-space analyses.
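The claim that forward products of transition matrices become indistinguishable across initial states can be illustrated numerically. The sketch below is not the authors' construction: it assumes the standard Dobrushin coefficient of ergodicity, $\tau(P) = \tfrac{1}{2}\max_{i,j}\sum_k |P_{ik} - P_{jk}|$, which may differ from the paper's Definition 3.2, and the matrices `P_A` and `P_B` as well as the function names are hypothetical.

```python
import numpy as np


def ergodicity_coefficient(P):
    """Dobrushin coefficient: half the largest L1 distance between two rows of P.

    Assumption: this is the textbook definition, used here only for illustration.
    """
    P = np.asarray(P, dtype=float)
    # pairwise_l1[i, j] = sum_k |P[i, k] - P[j, k]|
    pairwise_l1 = np.abs(P[:, None, :] - P[None, :, :]).sum(axis=2)
    return 0.5 * pairwise_l1.max()


def forward_product(matrices):
    """Multiply row-stochastic matrices in the order they are applied."""
    product = np.eye(np.asarray(matrices[0]).shape[0])
    for M in matrices:
        product = product @ np.asarray(M, dtype=float)
    return product


# Hypothetical two-state transition matrices, one per action pair.
P_A = np.array([[0.9, 0.1],
                [0.2, 0.8]])
P_B = np.array([[0.5, 0.5],
                [0.3, 0.7]])

sequence = [P_A, P_B, P_A, P_B]
product = forward_product(sequence)

# Two opposite initial beliefs (row vectors over the states).
belief_1 = np.array([1.0, 0.0])
belief_2 = np.array([0.0, 1.0])
gap = np.abs(belief_1 @ product - belief_2 @ product).sum()

print("coefficient of each factor:", [ergodicity_coefficient(M) for M in sequence])
print("coefficient of the product:", ergodicity_coefficient(product))
print("L1 gap between the two updated beliefs:", gap)
```

A coefficient close to zero means that the rows of the product are nearly identical, so the belief after several stages depends only weakly on the initial belief. This is the intuition behind replacing the belief dynamics with a finite-state abstraction.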

Abstract

We study a class of two-player zero-sum stochastic games known as \textit{blind stochastic games}, where players neither observe the state nor receive any information about it during the game. A central concept for analyzing long-duration stochastic games is the \textit{uniform value}. A game has a uniform value $v$ if for every $\varepsilon>0$, Player 1 (resp., Player 2) has a strategy such that, for all sufficiently large $n$, his average payoff over $n$ stages is at least $v-\varepsilon$ (resp., at most $v+\varepsilon$). Prior work has shown that the uniform value may not exist in general blind stochastic games. To address this, we introduce a subclass called \textit{ergodic blind stochastic games}, defined by imposing an ergodicity condition on the state transitions. For this subclass, we prove the existence of the uniform value and provide an algorithm to approximate it, establishing the \textit{decidability} of the approximation problem. Notably, this decidability result is novel even in the single-player setting of Partially Observable Markov Decision Processes (POMDPs). Furthermore, we show that no algorithm can compute the uniform value exactly, emphasizing the tightness of our result. Finally, we establish that the uniform value is independent of the initial belief.
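The verbal definition of the uniform value above can be written with explicit quantifiers. The notation is an assumption made for illustration and is not taken from the paper: $\gamma_n(b,\sigma,\tau)$ denotes the expected average payoff over the first $n$ stages when the initial belief is $b$, Player 1 plays $\sigma$, and Player 2 plays $\tau$.

```latex
% Assumed notation: \gamma_n(b,\sigma,\tau) is the expected n-stage average payoff.
% The game has uniform value v if both conditions hold.
\[
\begin{aligned}
&\forall \varepsilon > 0,\ \exists \sigma^{*},\ \exists n_0,\ \forall n \ge n_0,\ \forall \tau:
  &\gamma_n(b,\sigma^{*},\tau) &\ge v - \varepsilon, \\
&\forall \varepsilon > 0,\ \exists \tau^{*},\ \exists n_0,\ \forall n \ge n_0,\ \forall \sigma:
  &\gamma_n(b,\sigma,\tau^{*}) &\le v + \varepsilon.
\end{aligned}
\]
```

The order of the quantifiers is what makes the value uniform: the strategy $\sigma^{*}$ (resp., $\tau^{*}$) is fixed before the horizon $n$ and must perform well for every sufficiently large $n$.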

Paper Structure

This paper contains 29 sections, 9 theorems, 53 equations, 2 tables, 1 algorithm.

Introduction
Contributions
Connection to prior work and novelty
Outline
Framework
Notation
Model Description
Framework
Outline of the Game
Strategies
Values
Computational Formalism
From Blind to Belief Stochastic Games
Ergodic Blind Stochastic Games
Class Description
...and 14 more sections

Key Result

Theorem 3.5

All ergodic blind stochastic games have a uniform value. Moreover, the decision version of approximating the uniform value for the class of ergodic blind stochastic games is decidable.

Theorems & Definitions (32)

Definition 2.1: Uniform Value
Definition 2.2: Decision Version of Computing the Uniform Value
Definition 2.3: Decision Version of Approximating the Uniform Value
Definition 2.4: $m$-Stage History
Definition 2.5: $m$-Stage Belief
Definition 3.1: Ergodicity
Remark 3.1
Definition 3.2: Coefficient of Ergodicity
Definition 3.3: Ergodic blind stochastic game
Remark 3.2
...and 22 more
