The Sample Complexity of Stackelberg Games

Francesco Bacchiocchi, Matteo Bollini, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

TL;DR

The paper addresses learning an optimal leader commitment in Stackelberg games when follower payoffs are unknown, proposing Learn-Optimal-Commitment, a novel algorithm that does not rely on restrictive assumptions and carefully accounts for the bit-precision of leader strategies. By combining interior sampling, robust hyperplane discovery, and a controlled binary-search framework, it achieves a sample complexity of ${\tilde O}\left(n^2\left(m^7L\log(1/\zeta)+ {\binom{m+n}{m}}\right)\right)$ with high probability, matching the known lower bounds in the regime where one side’s action set is fixed. The approach overcomes critical limitations of prior methods (Letchford2010, Peng2019) by removing stringent assumptions, avoiding degeneracies when identifying best-response (BR) regions, and explicitly balancing termination probability with sample cost. This work advances practical learning in commitment-based models and lays groundwork for applying similar techniques to other commitment-driven frameworks.
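To get a feel for how the two terms inside the bound compare, the following minimal sketch evaluates them for concrete parameters. It assumes, as the notation suggests, that $m$ and $n$ are the numbers of leader and follower actions, $L$ is the bit-complexity of the follower's payoffs, and $\zeta$ is the failure probability. The function name `sample_bound_terms` is hypothetical, the logarithm base is an arbitrary choice, and the constants and logarithmic factors hidden by ${\tilde O}$ are ignored, so the numbers are only indicative.

```python
from math import comb, log2


def sample_bound_terms(m, n, L, zeta):
    """Return the two terms of n^2 * (m^7 * L * log(1/zeta) + C(m+n, m)).

    Assumptions (not taken from the paper's code): base-2 logarithm,
    no hidden constants, no polylogarithmic factors.
    """
    precision_term = n**2 * m**7 * L * log2(1 / zeta)  # grows polynomially in m
    region_term = n**2 * comb(m + n, m)                # binomial term
    return precision_term, region_term


# Small instance: 2 leader actions, 3 follower actions, 8-bit payoffs.
precision_term, region_term = sample_bound_terms(m=2, n=3, L=8, zeta=0.25)
print(precision_term, region_term)
```

When either $m$ or $n$ is held fixed, the binomial term $\binom{m+n}{m}$ is polynomial in the other parameter, which is the regime the summary above refers to.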

Abstract

Stackelberg games (SGs) constitute the most fundamental and acclaimed models of strategic interactions involving some form of commitment. Moreover, they form the basis of more elaborate models of this kind, such as, e.g., Bayesian persuasion and principal-agent problems. Addressing learning tasks in SGs and related models is crucial to operationalize them in practice, where model parameters are usually unknown. In this paper, we revise the sample complexity of learning an optimal strategy to commit to in SGs. We provide a novel algorithm that (i) does not require any of the limiting assumptions made by state-of-the-art approaches and (ii) deals with a trade-off between sample complexity and termination probability arising when leader's strategies representation has finite precision. Such a trade-off has been completely neglected by existing algorithms and, if not properly managed, it may result in them using exponentially-many samples. Our algorithm requires novel techniques, which also pave the way to addressing learning problems in other models with commitment ubiquitous in the real world.
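The learning setting in the abstract can be pictured as querying a best-response oracle: the leader commits to a mixed strategy and observes which action the follower plays. The sketch below is a minimal illustration under stated assumptions, not the paper's algorithm. The names `expected_payoffs` and `best_response` are hypothetical, payoffs are given as plain matrices indexed by (leader action, follower action), and ties are broken in the leader's favor, which is the usual convention for Stackelberg equilibria. Exact rationals are used so that no rounding affects tie detection.

```python
from fractions import Fraction


def expected_payoffs(p, payoff):
    """Expected payoff of each follower action when the leader plays mixed strategy p."""
    n = len(payoff[0])
    return [sum(p[i] * payoff[i][j] for i in range(len(p))) for j in range(n)]


def best_response(p, leader_payoff, follower_payoff):
    """Index of the follower's best response to p (ties broken in favor of the leader)."""
    f_vals = expected_payoffs(p, follower_payoff)
    l_vals = expected_payoffs(p, leader_payoff)
    top = max(f_vals)
    candidates = [j for j, v in enumerate(f_vals) if v == top]
    return max(candidates, key=lambda j: l_vals[j])


# Toy 2x2 game; a learner would only see the returned index, never follower_payoff.
leader_payoff = [[1, 0], [0, 2]]
follower_payoff = [[1, 0], [0, 1]]

print(best_response([Fraction(3, 4), Fraction(1, 4)], leader_payoff, follower_payoff))
print(best_response([Fraction(1, 2), Fraction(1, 2)], leader_payoff, follower_payoff))
```

Each call counts as one sample. The second query lies exactly on the boundary between two best-response regions, which is the kind of point whose exact representation depends on the bit-precision of the leader's strategy.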

Paper Structure (29 sections, 20 theorems, 40 equations, 12 figures, 4 algorithms)

Key Result

Lemma 4.0

Given two points $p \in \textnormal{int}(\mathcal{P}_{j})$ with $a_j \in {A}_f$ and $p' \in \Delta_{m}$, each having bit-complexity bounded by $B$, let $\widetilde{p}\coloneqq \lambda p + (1- \lambda)p'$ for some $\lambda\in (0,2^{-m(B+4L)-1})$. Then: $\widetilde{p} \in \mathcal{P}_{j} \Leftrightarr

Figures (12)

  • Figure 1: Algorithm by Peng2019 fails even if all its assumptions are met.
  • Figure 2: Algorithm by Peng2019 requires exponentially-many samples.
  • Figure 3: Example of points $p^{+i}$ and $p^{-i}$ computed by Algorithm \ref{alg:hyperplane}, with the $m-1$ line segments used to compute the separating hyperplane $H_{jk}$.
  • Figure 4: Instance in which the approach by Peng2019 fails due to how $p^2$ is selected.
  • Figure 5: The binary search employed by Peng2019 requires exponentially many samples.
  • ...and 7 more figures

Theorems & Definitions (31)

  • Lemma 4.0
  • Lemma 4.0
  • Theorem 4.1
  • Lemma 4.1
  • Lemma 4.1
  • Lemma 4.1
  • Lemma 4.1
  • Lemma 4.1
  • Lemma C.0
  • proof
  • ...and 21 more