Arena-Independent Finite-Memory Determinacy in Stochastic Games

Patricia Bouyer, Youssouf Oualhadj, Mickael Randour, Pierre Vandenhove

TL;DR

This work studies arena-independent finite-memory (AIFM) strategies for stochastic zero-sum games on graphs and develops a framework to study their existence and properties across deterministic and stochastic arenas. It proves that whenever pure AIFM strategies suffice for optimal play, pure AIFM subgame-perfect strategies also exist, and it provides a one-to-two-player lift that reduces questions about two-player stochastic games to the analysis of one-player stochastic games (Markov decision processes). The authors characterize AIFM sufficiency via two properties of objectives, monotony and selectivity, offering practical criteria for establishing AIFM results for various objectives, including omega-regular, weak parity, and Muller objectives. While AIFM sufficiency holds broadly, they also show limits, such as the insufficiency of AIFM strategies for certain discounted-sum thresholds in one-player stochastic arenas and the failure of the lift under generic randomization. Overall, the paper advances the understanding of memory requirements in stochastic games and provides tools for establishing AIFM results.

Abstract

We study stochastic zero-sum games on graphs, which are prevalent tools to model decision-making in the presence of an antagonistic opponent in a random environment. In this setting, an important question is that of strategy complexity: what kinds of strategies are sufficient or required to play optimally (e.g., randomization or memory requirements)? Our contributions further the understanding of arena-independent finite-memory (AIFM) determinacy, i.e., the study of objectives for which memory is needed, but in a way that only depends on limited parameters of the game graphs. First, we show that objectives for which pure AIFM strategies suffice to play optimally also admit pure AIFM subgame perfect strategies. Second, we show that we can reduce the study of objectives for which pure AIFM strategies suffice in two-player stochastic games to the easier study of one-player stochastic games (i.e., Markov decision processes). Third, we characterize the sufficiency of AIFM strategies through two intuitive properties of objectives. This work extends a line of research started on deterministic games to stochastic ones.
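
To make the notion of arena-independent memory more concrete, the following is a minimal Python sketch, not the authors' formalization. It assumes a memory skeleton is an initial memory state together with an update function that reads colors only, and that the product of two skeletons updates both components in parallel. All names (`MemorySkeleton`, `product`, `last_color`, `parity_a`, `make_strategy`) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable

Color = Hashable
MemState = Hashable


@dataclass(frozen=True)
class MemorySkeleton:
    """Assumed shape: an initial memory state and an update on colors."""
    init: MemState
    update: Callable[[MemState, Color], MemState]

    def run(self, colors: Iterable[Color]) -> MemState:
        """Return the memory state reached after reading a color sequence."""
        m = self.init
        for c in colors:
            m = self.update(m, c)
        return m


def product(sk1: MemorySkeleton, sk2: MemorySkeleton) -> MemorySkeleton:
    """Product skeleton: pairs of memory states, updated componentwise."""
    return MemorySkeleton(
        init=(sk1.init, sk2.init),
        update=lambda m, c: (sk1.update(m[0], c), sk2.update(m[1], c)),
    )


# Illustrative skeleton 1: remembers the last color seen (None at the start).
last_color = MemorySkeleton(init=None, update=lambda m, c: c)

# Illustrative skeleton 2: remembers the parity of the number of 'a' colors.
parity_a = MemorySkeleton(
    init=0,
    update=lambda m, c: (m + 1) % 2 if c == "a" else m,
)


def make_strategy(skeleton: MemorySkeleton, next_action: Callable):
    """Build a pure strategy whose choice depends only on the current arena
    state and the skeleton's memory state, not on the full history."""
    def play(state, history_colors):
        return next_action(state, skeleton.run(history_colors))
    return play


both = product(last_color, parity_a)
print(both.run("abab"))
```

The skeleton refers only to colors, never to arena states, which is the sense in which the memory is independent of the arena in this sketch; only the `next_action` function passed to `make_strategy` is specific to a given arena.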

Paper Structure

This paper contains 36 sections, 18 theorems, 84 equations, 7 figures.

Key Result

Lemma 2.17

Let $\mathcal{G} = (\mathcal{A}, S_{\mathsf{init}}, \sqsubseteq)$ be a game and $\mathsf{X}\in\{\mathsf{P}\mathsf{FM}, \mathsf{P}, \mathsf{G}\mathsf{FM}, \mathsf{G}\}$ be a type of strategies. Let $(\sigma_1^a, \sigma_2^a), (\sigma_1^b, \sigma_2^b)\in\Sigma_{1}^{\mathsf{X}}(\mathcal{A}, S_{\mathsf{i

Figures (7)

  • Figure 1: Initialized arena with $A(t) = \{a,b\}$ (omitting colors) (left) and its split on $t$ (right). States controlled by $\mathcal{P}_{1}$ (resp. $\mathcal{P}_{2}$) are depicted by circles (resp. squares). The dot after playing action $a$ represents a stochastic transition, with probability $\frac{1}{2}$ to go to $r$ and $\frac{1}{2}$ to go to $s$.
  • Figure 2: $\mathcal{P}_{1}$ can obtain $W$ with probability $1$, but not with a memoryless strategy. All transitions are deterministic; colors are shown, but action names are omitted.
  • Figure 3: Arenas $(\mathcal{A}_1, s_1)$ and $(\mathcal{A}_2, s_2)$ used in Example \ref{ex:monExamples}. Action names are omitted; integers next to the transitions represent the colors.
  • Figure 4: Arenas $(\mathcal{A}_1, s_1)$ and $(\mathcal{A}_2, s_2)$ used in Example \ref{ex:selExample}. Only actions $a$ and $b$ are named. Notation $a\mid c$ indicates that color $c$ is seen when action $a$ is played.
  • Figure 5: Initialized arenas $(\mathcal{A}_\mathsf{mon}, \{s_0^w,s_0^{w'}\})$ (left) and $(\mathcal{A}_\mathsf{sel}, s_0^w)$ (right).
  • ...and 2 more figures

Theorems & Definitions (62)

  • Definition 2.1: Arena
  • Definition 2.2: Initialized arena
  • Definition 2.3: Memory skeleton
  • Definition 2.4: Product of skeletons
  • Definition 2.5: Product initialized arenas
  • Definition 2.6: Strategy
  • Remark 2.7
  • Definition 2.8: Preference relation
  • Example 2.9
  • Definition 2.10: Initialized game
  • ...and 52 more