Learning to Steer Learners in Games

Yizhou Zhang; Yi-An Ma; Eric Mazumdar

TL;DR

This work studies how an optimizer can steer a no-regret learner toward a Stackelberg equilibrium in repeated two-player bimatrix games when the learner's payoffs are unknown. It first proves an impossibility result: steering cannot be guaranteed against the fully general class of no-regret learners. It then develops a payoff-matrix recovery framework based on facets and equivalence classes, combined with pessimistic strategies that guarantee sublinear Stackelberg regret whenever the payoff estimate meets certain accuracy guarantees. For two restricted learner classes (ascent algorithms and stochastic mirror ascent), the concrete algorithms PAAL and PAMD learn the learner's payoff structure in a sublinear number of rounds and achieve $o(T)$ Stackelberg regret. The study clarifies what information an optimizer needs in order to steer in strategic online environments and provides explore-then-commit strategies with provable performance guarantees.

Abstract

We consider the problem of learning to exploit learning algorithms through repeated interactions in games. Specifically, we focus on repeated two-player, finite-action games in which an optimizer aims to steer a no-regret learner to a Stackelberg equilibrium without knowledge of the learner's payoffs. We first show that this is impossible if the optimizer only knows that the learner is using an algorithm from the general class of no-regret algorithms. This suggests that the optimizer requires more information about the learner's objectives or algorithm to successfully exploit it. Building on this intuition, we reduce the optimizer's problem to that of recovering the learner's payoff structure. We demonstrate the effectiveness of this approach when the learner's algorithm is drawn from a smaller class by analyzing two examples: one where the learner uses an ascent algorithm, and another where the learner uses stochastic mirror ascent with known regularizer and step sizes.
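
To make the mirror-ascent setting concrete, the sketch below simulates a learner running deterministic mirror ascent with an entropy regularizer (multiplicative weights) against an optimizer that commits to a fixed mixed strategy. It is only an illustration of the learner dynamics and is not PAAL or PAMD. The payoff matrix `B`, the committed strategy `x`, the step size `eta`, and the function names are all hypothetical choices. The sketch also uses exact expected payoffs rather than the stochastic feedback considered in the paper.

```python
import math


def expected_payoffs(B, x):
    """Learner's expected payoff for each of its actions when the optimizer plays x."""
    n_rows, n_cols = len(B), len(B[0])
    return [sum(x[i] * B[i][j] for i in range(n_rows)) for j in range(n_cols)]


def best_response(B, x):
    """Index of the learner action with the highest expected payoff against x."""
    values = expected_payoffs(B, x)
    return max(range(len(values)), key=lambda j: values[j])


def entropic_mirror_ascent(B, x, steps=200, eta=0.5):
    """Run mirror ascent with an entropy regularizer from the uniform strategy.

    With this regularizer the update is the multiplicative-weights rule:
    y_j <- y_j * exp(eta * payoff_j), followed by normalization.
    """
    n_cols = len(B[0])
    y = [1.0 / n_cols] * n_cols
    grad = expected_payoffs(B, x)  # constant because x is fixed
    for _ in range(steps):
        weights = [y[j] * math.exp(eta * grad[j]) for j in range(n_cols)]
        total = sum(weights)
        y = [w / total for w in weights]
    return y


# Hypothetical learner payoff matrix (rows: optimizer actions, columns: learner actions).
B = [[1.0, 0.0],
     [0.0, 2.0]]
# Hypothetical mixed strategy the optimizer commits to.
x = [0.8, 0.2]

y_final = entropic_mirror_ascent(B, x)
```

Against a fixed commitment, the learner's strategy concentrates on its best response to `x`. This is why the learner's observed play carries information about its payoff structure, which is the information the optimizer needs before it can commit to a steering strategy.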
