An Objective Improvement Approach to Solving Discounted Payoff Games

Daniele Dell'Erba, Arthur Dumas, Sven Schewe

TL;DR

The paper introduces a fully symmetric objective-improvement approach for discounted payoff games that preserves the entire edge-inequation system while optimizing an objective based on a fixed outgoing edge per vertex. By iteratively solving linear programs and updating either the objective or the chosen edges, the method drives the solution toward co-optimal strategies without privileging either player. The authors formalize the algorithm, analyze conditions under which improvements are guaranteed (sharp/improving games), propose perturbations to ensure progress, and demonstrate the approach against strategy improvement through experiments. The work suggests a viable third paradigm for solving symmetric payoff games, with potential implications for tractability and broader applicability to parity and mean-payoff variants.

Abstract

While discounted payoff games and classic games that reduce to them, like parity and mean-payoff games, are symmetric, their solutions are not. We have taken a fresh view on the properties that optimal solutions need to have, and devised a novel way to converge to them, which is entirely symmetric. We achieve this by building a constraint system that uses every edge to define an inequation, and updating the objective function by taking a single outgoing edge for each vertex into account. These edges loosely represent strategies of both players, where the objective function intuitively asks to make the inequations for these edges sharp. Where they are not sharp, there is an 'error' represented by the difference between the two sides of the inequation, which is 0 where the inequation is sharp. Hence, the objective is to minimise the sum of these errors. For co-optimal strategies, and only for them, it can be achieved that all selected inequations are sharp or, equivalently, that the sum of these errors is zero. As long as no co-optimal strategies have been found, we step-wise improve the error by improving the solution for a given objective function or by improving the objective function for a given solution. This also challenges the gospel that methods for solving payoff games are either based on strategy improvement or on value iteration.
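The error described in the abstract can be illustrated on a tiny example. The sketch below is not taken from the paper: the two-vertex game, the per-edge weight/discount encoding, and the names `OWNER`, `EDGES`, `offset`, `feasible` and `error` are all assumptions made for illustration. It assumes that an edge from v to v' with weight w and discount d yields the inequation val(v) >= w + d * val(v') for a maximiser vertex and val(v) <= w + d * val(v') for a minimiser vertex.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Hypothetical toy game (not from the paper): "a" belongs to the maximiser,
# "b" to the minimiser. Each edge (source, target) maps to (weight, discount).
OWNER = {"a": "max", "b": "min"}
EDGES = {
    ("a", "a"): (0, HALF),
    ("a", "b"): (1, HALF),
    ("b", "a"): (0, HALF),
    ("b", "b"): (3, HALF),
}


def offset(val, edge):
    """Difference between the two sides of the inequation of one edge."""
    weight, discount = EDGES[edge]
    src, dst = edge
    return weight + discount * val[dst] - val[src]


def feasible(val):
    """Check the inequation of every edge, not only the selected ones."""
    for edge in EDGES:
        off = offset(val, edge)
        if OWNER[edge[0]] == "max" and off > 0:
            return False
        if OWNER[edge[0]] == "min" and off < 0:
            return False
    return True


def error(val, sigma):
    """Sum of errors over the one selected outgoing edge per vertex."""
    return sum(abs(offset(val, (v, sigma[v]))) for v in OWNER)


# Valuation obtained by solving val(a) = 1 + val(b)/2 and val(b) = val(a)/2.
VAL = {"a": Fraction(4, 3), "b": Fraction(2, 3)}
CO_OPTIMAL = {"a": "b", "b": "a"}  # selected successor per vertex
OTHER = {"a": "a", "b": "b"}

print(feasible(VAL))           # True
print(error(VAL, CO_OPTIMAL))  # 0
print(error(VAL, OTHER))       # 10/3
```

In this toy game the selection `CO_OPTIMAL` makes both selected inequations sharp, so the summed error is zero. For `OTHER`, the only valuation that makes both selected edges sharp is val(a) = 0, val(b) = 6, which violates the inequation of the edge from "a" to "b", so no feasible valuation reaches error zero for that selection.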

Paper Structure

This paper contains 14 sections, 10 theorems, 23 equations, 9 figures, 2 tables, 2 algorithms.

Key Result

Theorem 4.3

If $\sigma$ describes co-optimal strategies, then $f_\sigma(\mathsf{val}) = 0$ holds at Line 6 of the objective improvement algorithm. If $\mathsf{val}$ (from Line 5 of the algorithm) defines joint strategies $\sigma'$ for both players, then $\sigma'$ is co-optimal and $\mathsf{val}$ is the valuation of …
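The condition $f_\sigma(\mathsf{val}) = 0$ can be connected to the abstract's statement that all selected inequations are sharp. The block below is a sketch under assumed notation that is not quoted from the paper: edge weights $w_e$, discount factors $\lambda_e$, vertex sets $V_{\max}$ and $V_{\min}$, and the helper $\mathsf{offset}$ are introduced here for illustration only.

```latex
% Assumed notation: an edge e = (v, v') has weight w_e and discount \lambda_e.
\begin{align*}
\mathsf{offset}(\mathsf{val}, (v, v')) &= w_{(v,v')} + \lambda_{(v,v')}\,\mathsf{val}(v') - \mathsf{val}(v), \\
\mathsf{offset}(\mathsf{val}, e) &\le 0 \quad \text{for every edge } e \text{ leaving some } v \in V_{\max}, \\
\mathsf{offset}(\mathsf{val}, e) &\ge 0 \quad \text{for every edge } e \text{ leaving some } v \in V_{\min}, \\
f_\sigma(\mathsf{val}) &= \sum_{v \in V} \bigl|\mathsf{offset}(\mathsf{val}, (v, \sigma(v)))\bigr|.
\end{align*}
% Every summand is non-negative, so f_\sigma(val) = 0 holds exactly when
% offset(val, (v, \sigma(v))) = 0 for all v, i.e. when every selected
% inequation is sharp.
```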

Figures (9)

  • Figure 1: A discounted payoff game. Maximiser vertices are depicted as circles, minimiser ones as squares.
  • Figure 2: Comparisons of the average number of iterations (LP calls) on games with two successors for each vertex.
  • Figure 3: Comparisons of the average number of local strategy updates for games with two successors per vertex.
  • Figure 4: Comparisons of the average number of iterations (LP calls) for games with many (5 to 10) successors per vertex.
  • Figure 5: Comparisons of the average number of local strategy updates for games with many (5 to 10) successors per vertex.
  • ...and 4 more figures

Theorems & Definitions (21)

  • Theorem 4.3
  • proof
  • Definition 4.4
  • Corollary 4.5
  • Lemma 5.1
  • proof
  • Corollary 5.2
  • Lemma 5.4
  • proof
  • Definition 5.5
  • ...and 11 more