No-Regret Learning in Stackelberg Games with an Application to Electric Ride-Hailing
Anna Maddux, Marko Maljkovic, Nikolas Geroliminis, Maryam Kamgarpour
TL;DR
This work addresses learning in single-leader multi-follower Stackelberg games when the leader does not know the lower-level game and treats it as a black box. It introduces a no-regret algorithm that uses Gaussian process regression under RKHS regularity assumptions to converge to an $\epsilon$-Stackelberg equilibrium in $O(\sqrt{T})$ rounds, without requiring the followers to share private utility information. The method accommodates approximate Nash responses by the followers and relies only on bandit feedback, and it comes with theoretical guarantees. A numerical study on electric ride-hailing pricing demonstrates robustness to lower-level approximation errors and validates the approach in a realistic setting.
Abstract
We consider the problem of efficiently learning to play single-leader multi-follower Stackelberg games when the leader lacks knowledge of the lower-level game. Such games arise in hierarchical decision-making problems involving self-interested agents. For example, in electric ride-hailing markets, a central authority aims to learn optimal charging prices to shape fleet distributions and charging patterns of ride-hailing companies. Existing works typically apply gradient-based methods to find the leader's optimal strategy. Such methods are impractical as they require that the followers share private utility information with the leader. Instead, we treat the lower-level game as a black box, assuming only that the followers' interactions approximate a Nash equilibrium while the leader observes the realized cost of the resulting approximation. Under kernel-based regularity assumptions on the leader's cost function, we develop a no-regret algorithm that converges to an $\epsilon$-Stackelberg equilibrium in $O(\sqrt{T})$ rounds. Finally, we validate our approach through a numerical case study on optimal pricing in electric ride-hailing markets.
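To make the black-box setting concrete, the following is a minimal sketch of a generic Gaussian-process lower-confidence-bound (GP-LCB) bandit loop, in which a leader repeatedly picks a scalar action and observes only a noisy realized cost. It is not the algorithm of the paper. The toy cost `black_box_cost` (a hypothetical stand-in for the followers' approximate equilibrium response), the RBF kernel, the lengthscale, the exploration weight `beta`, and the action grid are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise=1e-2):
    """Zero-mean GP posterior mean and standard deviation on x_grid."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_grid)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def black_box_cost(x, rng):
    """Hypothetical leader cost: the followers' reaction is hidden inside
    this function and the leader only sees a noisy scalar (bandit feedback)."""
    return (x - 0.3) ** 2 + 0.01 * rng.standard_normal()

def run_gp_lcb(T=40, beta=2.0, seed=0):
    """Play T rounds; each round picks the action minimizing mean - beta * std."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 101)          # assumed discretized action set
    x_obs = np.array([rng.choice(grid)])       # first action chosen at random
    y_obs = np.array([black_box_cost(x_obs[0], rng)])
    for _ in range(T - 1):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        x_next = grid[np.argmin(mean - beta * std)]  # optimistic choice for a cost
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, black_box_cost(x_next, rng))
    mean, _ = gp_posterior(x_obs, y_obs, grid)
    return grid[np.argmin(mean)], x_obs        # estimated best action, history

if __name__ == "__main__":
    best, history = run_gp_lcb()
    print("estimated best action:", best, "after", len(history), "rounds")
```

The leader never queries gradients or follower utilities in this loop. It only updates a surrogate of its own cost from realized values, which is the kind of feedback the abstract describes.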
