Learning in Structured Stackelberg Games

Maria-Florina Balcan, Kiriaki Fragkia, Keegan Harris

TL;DR

This work introduces structured Stackelberg games, in which contextual information is predictive of the follower's type, and develops a learnability theory for both the online and distributional settings. The authors define the Stackelberg-Littlestone dimension (SLdim), which captures the joint complexity of the hypothesis class and the Stackelberg payoff structure, and present the Stackelberg Standard Optimal Algorithm (SSOA), which achieves instance-optimal regret $\text{SLdim}_{\mathcal{G}}(\mathcal{H})$ in the online setting. For distributional learning, they define the gamma-valued SN and SG dimensions, which control lower and upper bounds on sample complexity, and give a PAC-style learner $\mathfrak{L}^*$ whose guarantees scale with these dimensions. SLdim can be strictly smaller than the classical Littlestone dimension, so the leader's task can be learnable where traditional multiclass tools fail. The authors also connect these ideas to broader settings such as auctions with side information and Bayesian persuasion, with implications for security, AI safety, and related economic settings.

Abstract

We initiate the study of structured Stackelberg games, a novel form of strategic interaction between a leader and a follower where contextual information can be predictive of the follower's (unknown) type. Motivated by applications such as security games and AI safety, we show how this additional structure can help the leader learn a utility-maximizing policy in both the online and distributional settings. In the online setting, we first prove that standard learning-theoretic measures of complexity do not characterize the difficulty of the leader's learning task. Notably, we find that there exists a learning-theoretic measure of complexity, analogous to the Littlestone dimension in online classification, that tightly characterizes the leader's instance-optimal regret. We term this the Stackelberg-Littlestone dimension, and leverage it to provide a provably optimal online learning algorithm. In the distributional setting, we provide analogous results by showing that two new dimensions control the sample complexity upper and lower bounds.
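
For readers unfamiliar with the classical baseline that the abstract compares against, the following is a minimal sketch of the multiclass Littlestone dimension and the classical Standard Optimal Algorithm (SOA) for a finite hypothesis class over a finite set of contexts. It is not the paper's SLdim or SSOA, which also account for the Stackelberg payoff structure. It assumes the standard binary-tree definition of a multiclass Littlestone tree (two distinct labels on the edges leaving each node). The names `restrict`, `ldim`, `soa_predict`, and `run_soa`, and the threshold class used as data, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def restrict(hyps, x, y):
    """Keep only the hypotheses that label context x with y."""
    return [h for h in hyps if h[x] == y]


def ldim(hyps, domain):
    """Brute-force multiclass Littlestone dimension of a finite class.

    Each hypothesis is a dict mapping every context in `domain` to a label.
    Returns -1 for an empty class and 0 when no context separates the class.
    """
    if not hyps:
        return -1
    best = 0
    for x in domain:
        labels = sorted({h[x] for h in hyps})
        # A node at x needs two distinct labels, each with a shattered subtree.
        for y1, y2 in combinations(labels, 2):
            depth = 1 + min(ldim(restrict(hyps, x, y1), domain),
                            ldim(restrict(hyps, x, y2), domain))
            best = max(best, depth)
    return best


def soa_predict(version_space, domain, x):
    """Classical SOA: predict the label whose restricted class has the largest Ldim."""
    labels = sorted({h[x] for h in version_space})
    return max(labels, key=lambda y: ldim(restrict(version_space, x, y), domain))


def run_soa(hyps, domain, sequence, target):
    """Count SOA mistakes on a realizable sequence labeled by `target`."""
    version_space, mistakes = list(hyps), 0
    for x in sequence:
        guess = soa_predict(version_space, domain, x)
        truth = target[x]
        mistakes += guess != truth
        version_space = restrict(version_space, x, truth)
    return mistakes


# Illustrative data (an assumption, not from the paper): thresholds on 4 contexts.
domain = [0, 1, 2, 3]
thresholds = [{x: int(x >= t) for x in domain} for t in range(5)]

print("Ldim:", ldim(thresholds, domain))
print("worst-case SOA mistakes:",
      max(run_soa(thresholds, domain, domain, t) for t in thresholds))
```

In the classical realizable setting, the number of SOA mistakes never exceeds the Littlestone dimension of the class. The recursion is exponential in the class size, so this sketch is only practical for small examples.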

Paper Structure

This paper contains 22 sections, 21 theorems, 47 equations, 5 figures, 2 tables, 3 algorithms.

Key Result

Theorem 3.6

There exists a Stackelberg game $\mathcal{G}$ and a hypothesis class $\mathcal{H}$ such that $\textnormal{Ldim}(\mathcal{H}) = \infty$, but the leader's optimal policy can be determined without learning.
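
The intuition behind this statement can be illustrated with a toy game in which the follower's type changes the follower's behavior but not the leader's best choice. The sketch below is not the paper's construction. It assumes pure-strategy commitments (Stackelberg leaders may in general commit to mixed strategies), a leader utility that does not depend on the follower's type, and made-up payoff tables. All names (`follower_best_response`, `leader_best_commitment`, `follower_types`) are hypothetical.

```python
def follower_best_response(follower_utility, leader_action):
    """Follower action that maximizes its own utility against a committed leader action."""
    row = follower_utility[leader_action]
    return max(range(len(row)), key=lambda b: row[b])


def leader_best_commitment(leader_utility, follower_utility):
    """Leader's best pure commitment, anticipating the follower's best response."""
    def value(a):
        b = follower_best_response(follower_utility, a)
        return leader_utility[a][b]
    return max(range(len(leader_utility)), key=value)


# Rows index leader actions, columns index follower actions (made-up numbers).
leader_utility = [[3, 1],
                  [2, 0]]

# Three follower types with different utilities and different best responses.
follower_types = {
    "A": [[1, 0], [0, 1]],
    "B": [[2, 1], [5, 3]],
    "C": [[0, 4], [0, 1]],
}

commitments = {
    name: leader_best_commitment(leader_utility, utility)
    for name, utility in follower_types.items()
}
print(commitments)
```

In this toy instance every type leads to the same leader commitment, so a leader who cannot tell the types apart loses nothing. A hypothesis class mapping contexts to these types could be arbitrarily complex, with unbounded Littlestone dimension, without making the leader's problem any harder.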

Figures (5)

  • Figure 1: $\mathcal{H}_3$-shattered Littlestone Tree showing a consistent hypothesis for each root-to-leaf path.
  • Figure 2: Follower types as a function of the context in the construction of \ref{thm:maximallydifferent}. While $\gamma_1$ and $\gamma_2$ vary with each hypothesis, the leader's optimal strategy only depends on whether $\mathbf{z}[2] < 0.5$.
  • Figure 3: $\mathcal{H}_3$-shattered SL-Tree showing the consistent hypothesis for each root-to-leaf path. Leaf nodes have weight $0$, nodes at depth $1$ have weight $1/2$ and the root node has weight $1/2+2/3$.
  • Figure 4: Example construction of an $\mathcal{H}$-shattered Littlestone tree. Each node represents a context in $\mathcal{Z} = \{1, 2, 3, 4\}$ and each edge corresponds to a follower type.
  • Figure 5: $\mathcal{H}$-shattered SL-Tree with (a) $z=1$ as the root node and (b) $z=2$ as the root node.

Theorems & Definitions (62)

  • Remark 2.1
  • Definition 3.1: Contextual Stackelberg Regret
  • Remark 3.2
  • Definition 3.3: Multiclass Littlestone Tree
  • Example 3.4
  • Definition 3.5: Multiclass Littlestone Dimension
  • Theorem 3.6
  • Proof: Proof sketch
  • Definition 3.7: Stackelberg Littlestone (SL) Tree
  • Example 3.8
  • ...and 52 more