Synthesis of Reward Machines for Multi-Agent Equilibrium Design (Full Version)

Muhammad Najib; Giuseppe Perelli

TL;DR

The paper studies equilibrium design in weighted multi-agent mean-payoff games by introducing reward machines (RMs) to implement dynamic incentives. It generalizes subsidy schemes by enabling memoryful rewards that depend on execution history, and formulates the payoff-improvement problem under optimistic and pessimistic NE selections. The authors show that the strong and weak improvement problems are solvable in $\mathsf{P}^{\mathsf{NP}}$ (i.e., $\Delta_2^{\mathsf{P}}$) and provide hardness results (NP-hard or coNP-hard), along with a synthesis method for the corresponding RM when improvements exist. A key contribution is the reduction to an auxiliary game where reward-distribution decisions are captured by a designated agent, enabling NE verification to guide RM synthesis. The work highlights that reward machines can outperform memoryless subsidy schemes and lays groundwork for future integration with normative systems and richer objective combinations.

Abstract

Mechanism design is a well-established game-theoretic paradigm for designing games to achieve desired outcomes. This paper addresses a closely related but distinct concept, equilibrium design. Unlike mechanism design, the designer's authority in equilibrium design is more constrained; she can only modify the incentive structures in a given game to achieve certain outcomes without the ability to create the game from scratch. We study the problem of equilibrium design using dynamic incentive structures, known as reward machines. We use weighted concurrent game structures for the game model, with goals (for the players and the designer) defined as mean-payoff objectives. We show how reward machines can be used to represent dynamic incentives that allocate rewards in a manner that optimises the designer's goal. We also introduce the main decision problem within our framework, the payoff improvement problem. This problem essentially asks whether there exists a dynamic incentive (represented by some reward machine) that can improve the designer's payoff by more than a given threshold value. We present two variants of the problem: strong and weak. We demonstrate that both can be solved in polynomial time using a Turing machine equipped with an NP oracle. Furthermore, we also establish that these variants are either NP-hard or coNP-hard. Finally, we show how to synthesise the corresponding reward machine if it exists.
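To make the idea of a dynamic incentive concrete, here is a minimal sketch of one agent's mean payoff on an ultimately periodic (lasso) play once a reward machine is attached. It assumes a Mealy-style machine that reads game states and emits an integer reward for a single agent; the class `RewardMachine`, the functions `mean_payoff_lasso` and `payoff_with_machine`, and the toy weights are all hypothetical names and values chosen for illustration, and the paper's own definition (for instance, any budget constraint or per-agent reward vector) may differ.

```python
from fractions import Fraction


def mean_payoff_lasso(prefix, cycle):
    """Limit-average of the infinite weight sequence prefix . cycle^omega.

    On an ultimately periodic sequence the finite prefix does not affect the
    limit, so the mean payoff is simply the average weight of the cycle.
    """
    if not cycle:
        raise ValueError("cycle must be non-empty")
    return Fraction(sum(cycle), len(cycle))


class RewardMachine:
    """Assumed Mealy-style machine: reads game states, emits a reward."""

    def __init__(self, initial, delta, out):
        self.initial = initial
        self.delta = delta  # (machine_state, game_state) -> machine_state
        self.out = out      # (machine_state, game_state) -> integer reward


def payoff_with_machine(weight, prefix, cycle, rm):
    """Mean payoff of one agent on the play prefix . cycle^omega with rm attached.

    weight maps each game state to the agent's base weight. The machine state
    at the start of each pass over the cycle must eventually repeat (the
    machine is finite), so we average over the passes between repetitions.
    """
    q = rm.initial
    for s in prefix:
        q = rm.delta[(q, s)]
    first_seen = {}  # machine state at the start of a pass -> pass index
    pass_sums = []   # total (base weight + reward) collected in each pass
    while q not in first_seen:
        first_seen[q] = len(pass_sums)
        total = 0
        for s in cycle:
            total += weight[s] + rm.out[(q, s)]
            q = rm.delta[(q, s)]
        pass_sums.append(total)
    periodic = pass_sums[first_seen[q]:]
    return Fraction(sum(periodic), len(periodic) * len(cycle))


# Toy instance (illustrative values only): the play alternates a, b forever.
weight = {"a": 0, "b": 2}
rm = RewardMachine(
    initial="q0",
    # The machine toggles its state on every visit to "a" and ignores "b".
    delta={("q0", "a"): "q1", ("q1", "a"): "q0",
           ("q0", "b"): "q0", ("q1", "b"): "q1"},
    # It pays a reward of 1 only on every other visit to "a".
    out={("q0", "a"): 1, ("q1", "a"): 0,
         ("q0", "b"): 0, ("q1", "b"): 0},
)
base = mean_payoff_lasso([], [weight[s] for s in ["a", "b"]])
boosted = payoff_with_machine(weight, [], ["a", "b"], rm)
```

In this toy instance `base` is 1 and `boosted` is 5/4. The reward is paid on alternate visits to the same game state, which is the kind of history-dependent behaviour a memoryless subsidy scheme cannot express.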

Paper Structure (14 sections, 7 theorems, 2 equations, 3 figures, 1 algorithm)

This paper contains 14 sections, 7 theorems, 2 equations, 3 figures, 1 algorithm.

Introduction
Related work
Preliminaries
Mean-Payoff
Multi-Player Mean-Payoff Game
Nash Equilibrium
Reward Machines for Equilibrium Design
Reward Engineering
Solving Improvement Problems
Conclusion
Future work
Appendix
On Exact Optimal $\mathsf{worstNE}(\mathcal{G})$
Proof of $\mathsf{coNP}$-hardness for the weak $\varepsilon$-improvement problem

Key Result

Lemma 1

For a given $\mathcal{G} \dagger \mathcal{M}$ and its associated auxiliary game $\mathcal{G}'$, the following hold:

Figures (3)

Figure 1: Graphical representation (left) and game arena (right) for Example 1.
Figure 2: Reward machine $\mathcal{M}$.
Figure 3: Reward machine $\mathcal{M}'$.

Theorems & Definitions (21)

Definition 1
Definition 2: Reward Machine
Definition 3: Global payoff improvement problems
Example 1
Definition 4
Lemma 1
proof
Lemma 2
proof
Lemma 3
...and 11 more
