Satisfaction and Regret in Stackelberg Games

Langford White; Duong Nguyen; Hung Nguyen

TL;DR

The paper investigates how incorporating follower satisfaction into Stackelberg games, via a threshold $U_-^f$ on the follower's utility, affects equilibrium outcomes and learning dynamics. It develops linear-programming formulations (both multiple-LP and single-LP) to compute optimal leader commitments under follower satisfaction, and analyzes regret-based learning approaches (unconditional and conditional regret matching) in this setting. A key theoretical result shows that, when the leader is restricted to pure strategies, the leader's expected utility under satisfaction is never less than in the standard Stackelberg game. Simulations suggest that follower satisfaction often raises leader utility, and show that regret-based methods exhibit diverse convergence behaviors, highlighting both potential benefits and the need for further research in satisfaction-driven Stackelberg learning.

Abstract

This paper introduces the new concept of (follower) satisfaction in Stackelberg games and compares the standard Stackelberg game with its satisfaction version. Simulation results are presented which suggest that the follower adopting satisfaction generally increases leader utility. This important new result is proven for the case where the strategies the leader may commit to are restricted to be deterministic (pure strategies). The paper then addresses the application of regret-based algorithms to the Stackelberg problem. Although it is known that the follower adopts a no-regret position in a Stackelberg solution, this is not generally the case for the leader. The paper examines the convergence behaviour of unconditional and conditional regret matching (RM) algorithms in the Stackelberg setting. It shows that, in the examples considered, these algorithms converge either to a pure Nash equilibrium of the simultaneous-move game, or to mixed strategies which do not have the "no-regret" property. In one case, convergence of the conditional RM algorithm over both players to a solution "close" to the Stackelberg case was observed. The paper argues that further research in this area, in particular when applied in the satisfaction setting, could be fruitful.
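To make the regret-matching idea concrete, here is a minimal sketch of unconditional regret matching applied by both players of a simultaneous-move bimatrix game. It follows the standard textbook form (play each action with probability proportional to its positive cumulative regret, uniformly if no regret is positive) and is not necessarily the exact variant used in the paper. The function name `regret_matching`, the prisoner's-dilemma payoffs, the round count, and the seed are all illustrative assumptions.

```python
import random


def regret_matching(payoff_row, payoff_col, rounds=2000, seed=0):
    """Unconditional regret matching run by both players of a bimatrix game.

    payoff_row[i][j] and payoff_col[i][j] are the utilities of the row and
    column player when row plays i and column plays j (hypothetical inputs).
    Returns the empirical action frequencies of each player.
    """
    rng = random.Random(seed)
    n, m = len(payoff_row), len(payoff_row[0])
    regret_row, regret_col = [0.0] * n, [0.0] * m
    count_row, count_col = [0] * n, [0] * m

    def sample(regret):
        # Play proportionally to positive regret; uniform if none is positive.
        positive = [max(r, 0.0) for r in regret]
        if sum(positive) == 0.0:
            positive = [1.0] * len(regret)
        return rng.choices(range(len(regret)), weights=positive)[0]

    for _ in range(rounds):
        i, j = sample(regret_row), sample(regret_col)
        count_row[i] += 1
        count_col[j] += 1
        # Regret of action a: what a would have earned minus what was earned.
        for a in range(n):
            regret_row[a] += payoff_row[a][j] - payoff_row[i][j]
        for b in range(m):
            regret_col[b] += payoff_col[i][b] - payoff_col[i][j]

    return [c / rounds for c in count_row], [c / rounds for c in count_col]


# Illustrative prisoner's dilemma (action 0 = cooperate, 1 = defect).
PD_ROW = [[3, 0], [5, 1]]
PD_COL = [[3, 5], [0, 1]]
freq_row, freq_col = regret_matching(PD_ROW, PD_COL)
print(freq_row, freq_col)
```

In this toy game defection strictly dominates, so empirical play concentrates on the unique pure Nash equilibrium of the simultaneous-move game, which matches one of the convergence outcomes described above. Conditional regret matching, which tracks regret for switching from each played action to each alternative, is not shown here.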

Paper Structure (13 sections, 3 theorems, 15 equations, 2 figures)

Introduction
Stackelberg Satisfaction Games
Solution via Multiple Linear Programs
Examples
Solution via a Single Linear Program
Regret Based Methods
Regret Matching
Nash Equilibria
Convergence of regret based methods
Regrets in a Stackelberg Duopoly
Examples
Unconditional RM
Conditional RM

Key Result

Theorem 1

If the constraint matrix appearing in the LP indexed by follower BR $s^f$ has at least one +1 entry in every column, then that LP is infeasible.

Figures (2)

Figure 1: The left figure in each subplot shows the leader utility for the standard Stackelberg duopoly and that attained for the satisfaction version. The right figure in each subplot shows the corresponding follower satisfaction probability. The horizontal axis is the follower utility threshold value.

Theorems & Definitions (3)

Theorem 1
Proposition 1
Theorem 2
