Universal Complexity Bounds Based on Value Iteration for Stochastic Mean Payoff Games and Entropy Games

Xavier Allamigeon; Stéphane Gaubert; Ricardo D. Katz; Mateusz Skomra

Abstract

We develop value iteration-based algorithms to solve in a unified manner different classes of combinatorial zero-sum games with mean-payoff type rewards. These algorithms rely on an oracle, evaluating the dynamic programming operator up to a given precision. We show that the number of calls to the oracle needed to determine exact optimal (positional) strategies is, up to a factor polynomial in the dimension, of order R/sep, where the "separation" sep is defined as the minimal difference between distinct values arising from strategies, and R is a metric estimate, involving the norm of approximate sub- and super-eigenvectors of the dynamic programming operator. We illustrate this method with two applications. The first one is a new proof, leading to improved complexity estimates, of a theorem of Boros, Elbassioni, Gurvich and Makino, showing that turn-based mean-payoff games with a fixed number of random positions can be solved in pseudo-polynomial time. The second one concerns entropy games, a model introduced by Asarin, Cervelle, Degorre, Dima, Horn and Kozyakin. The rank of an entropy game is defined as the maximal rank among all the ambiguity matrices determined by strategies of the two players. We show that entropy games with a fixed rank, in their original formulation, can be solved in polynomial time, and that an extension of entropy games incorporating weights can be solved in pseudo-polynomial time under the same fixed rank condition.

Paper Structure (22 sections, 56 theorems, 92 equations, 9 figures)

Introduction
Motivation
Main Results
Related Work
Organization of the Paper
Preliminaries on Dynamic Programming Operators and Games
Introducing Shapley Operators: The Example of Stochastic Turn-Based Zero-Sum Games
The Operator Approach to Zero-Sum Games
Entropy Games
Bounding the Complexity of Value Iteration
A Universal Complexity Bound for Value Iteration
Value Iteration in Finite Precision Arithmetic
Deciding Whether the Value Is Independent of the Initial State
Finding the States of Maximal Value
Application to Stochastic Mean-Payoff Games
...and 7 more sections

Key Result

Theorem 1

Suppose that the function $F \colon \mathbb{R}^n\to \mathbb{R}^n$ is nonexpansive in any norm and that it is semialgebraic, or, more generally, defined in an o-minimal structure. Then the escape rate $\chi(F)$ exists.
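
As an illustration of Theorem 1, the following minimal sketch estimates the escape rate numerically, assuming the usual definition $\chi(F) = \lim_{k\to\infty} F^k(0)/k$. The operator `shapley_operator` is a hypothetical two-state example built from max, min and translations (hence piecewise linear and nonexpansive in the sup-norm); it is not taken from the paper, and the helper name `escape_rate_estimate` is likewise an assumption made for this example.

```python
def shapley_operator(x):
    """Hypothetical 2-state operator (illustrative only, not from the paper).

    State 0 is a maximizer: stay in 0 with reward 1, or move to 1 with reward 0.
    State 1 is a minimizer: stay in 1 paying 3, or move to 0 paying 0.
    """
    x0, x1 = x
    return (max(1.0 + x0, x1), min(3.0 + x1, x0))


def escape_rate_estimate(F, n, iterations):
    """Approximate chi(F) by F^k(0) / k, with k = iterations."""
    x = (0.0,) * n
    for _ in range(iterations):
        x = F(x)
    return tuple(xi / iterations for xi in x)


estimate = escape_rate_estimate(shapley_operator, 2, 1000)
print(estimate)
```

For this toy operator the iterates satisfy $F^k(0) = (k, k-1)$, so both coordinates of the estimate approach 1 as the number of iterations grows. The sketch evaluates the operator exactly; it does not model the finite-precision oracle discussed in the abstract.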

Figures (9)

Figure 1: A stochastic mean-payoff game.
Figure 2: An entropy game. Despot's states are represented by squares. Tribune's states are represented by circles. People's states are represented by small diamonds. The multiplicities are indicated on the arcs.
Figure 3: Basic value iteration algorithm.
Figure 4: Value iteration in finite precision arithmetic.
Figure 5: Approximating the value of a mean-payoff game when it is independent of the initial state, and computing approximate optimality certificates, working in finite precision arithmetic.
...and 4 more figures

Theorems & Definitions (116)

Example 1
Remark 1
Definition 1
Theorem 1: Ney03 and bolte2013
Theorem 2: kohlberg
Theorem 3: gaubert_gunawardena, polyhedra_equiv_mean_payoff
Theorem 4: Coro. of Theorems 9 and 13 of gaubert_gunawardena
Proposition 1: see e.g. RS01 or Prop. 3.1 of ergodicity_conditions
Example 2
Theorem 5: entropygamejournal
...and 106 more
