The Hidden Game Problem
Gon Buzaglo, Noah Golowich, Elad Hazan
TL;DR
The paper addresses learning in very large two-player games that contain a sparse, consistently dominant hidden set of actions $R$, where payoffs satisfy $A = A_0 + \rho A_1$ and $A_0(i,j)=1$ for $i\in R$. It introduces the hidden game problem and shows how to design regret-minimization algorithms that keep external regret low in any game while achieving sublinear swap regret when the hidden structure exists. There are two main technical contributions. The first is a swap-regret minimization scheme that incrementally uncovers $R$, with a bound of $O(\sqrt{T r^3 \log r})$. The second is a framework that controls external and swap regret simultaneously (via Hedge, Follow-the-Perturbed-Leader with a smooth oracle, and fixed-point updates), achieving external regret $O(\sqrt{T \log N})$ and swap regret $O(\sqrt{T r^3 \log r})$ with per-round runtime $\mathrm{poly}(T)$, independent of $N$. Together these yield rapid convergence to correlated equilibria in hidden subgames while preserving rationality in the full game, which enables scalable exploitation of sparse structure in AI alignment and language-game contexts.
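To make the payoff model and the Hedge component concrete, the following is a minimal sketch rather than the paper's algorithm: it runs plain Hedge over all $N$ actions, so its per-round cost scales with $N$, unlike the $N$-independent method summarized above. Several details are assumptions made only for illustration: $A_0(i,j)=0$ for $i\notin R$, entries of $A_1$ drawn uniformly from $[0,1)$, full-information feedback, an opponent playing uniformly random columns, and the helper names `make_hidden_game` and `hedge`.

```python
import math
import random

def make_hidden_game(n, hidden, rho, rng):
    """Build A = A0 + rho * A1 as a list of rows.

    Assumptions (not from the source): A0[i][j] = 0 for i outside `hidden`,
    and A1 has i.i.d. uniform entries in [0, 1).
    """
    return [[(1.0 if i in hidden else 0.0) + rho * rng.random()
             for _ in range(n)] for i in range(n)]

def hedge(payoff, columns, eta):
    """Run Hedge (exponential weights) for the row player.

    `columns` is the opponent's action sequence. Returns the expected
    cumulative payoff and the distribution played in the final round.
    """
    n = len(payoff)
    log_w = [0.0] * n            # log-weights, kept in log space for stability
    p = [1.0 / n] * n
    total = 0.0
    for j in columns:
        mx = max(log_w)
        w = [math.exp(v - mx) for v in log_w]
        z = sum(w)
        p = [x / z for x in w]
        total += sum(p[i] * payoff[i][j] for i in range(n))
        for i in range(n):       # full-information update on every action
            log_w[i] += eta * payoff[i][j]
    return total, p

rng = random.Random(0)
n, hidden, rho, T = 50, {3, 17, 42}, 0.1, 2000
A = make_hidden_game(n, hidden, rho, rng)
columns = [rng.randrange(n) for _ in range(T)]
eta = math.sqrt(8 * math.log(n) / T) / (1 + rho)   # payoffs lie in [0, 1 + rho)
total, p = hedge(A, columns, eta)
best = max(sum(A[i][j] for j in columns) for i in range(n))
regret = best - total
hidden_mass = sum(p[i] for i in hidden)
print(f"external regret: {regret:.2f}, mass on hidden set: {hidden_mass:.4f}")
```

Under these assumptions every hidden action earns at least 1 per round and every other action earns less than $\rho$, so the weights concentrate on $R$ quickly. The measured regret stays below the standard Hedge bound $\ln N/\eta + \eta T B^2/8$ with $B = 1+\rho$.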
Abstract
This paper investigates a class of games with large strategy spaces, motivated by challenges in AI alignment and language games. We introduce the hidden game problem, where for each player, an unknown subset of strategies consistently yields higher rewards compared to the rest. The central question is whether efficient regret minimization algorithms can be designed to discover and exploit such hidden structures, leading to equilibrium in these subgames while maintaining rationality in general. We answer this question affirmatively by developing a composition of regret minimization techniques that achieve optimal external and swap regret bounds. Our approach ensures rapid convergence to correlated equilibria in hidden subgames, leveraging the hidden game structure for improved computational efficiency.
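The two regret notions named in the abstract can be compared on a recorded play history. The sketch below uses the standard definitions: external regret compares the realized payoff with the best single fixed action, while swap regret allows a separate best replacement for each action that was played. The function names and the toy $2\times 2$ payoff matrix are illustrative choices, not taken from the paper.

```python
def external_regret(payoff, rows, cols):
    """Best fixed action in hindsight minus the payoff actually earned."""
    earned = sum(payoff[i][j] for i, j in zip(rows, cols))
    best_fixed = max(sum(payoff[k][j] for j in cols)
                     for k in range(len(payoff)))
    return best_fixed - earned

def swap_regret(payoff, rows, cols):
    """Sum, over each played action i, of the gain from the best
    replacement of i on exactly the rounds where i was played."""
    rounds_by_action = {}
    for i, j in zip(rows, cols):
        rounds_by_action.setdefault(i, []).append(j)
    total = 0.0
    for i, js in rounds_by_action.items():
        earned = sum(payoff[i][j] for j in js)
        best = max(sum(payoff[k][j] for j in js)
                   for k in range(len(payoff)))
        total += best - earned
    return total

# Toy history: the row player always mismatches the column player.
payoff = [[1, 0], [0, 1]]
rows = [0, 1, 0, 1]
cols = [1, 0, 1, 0]
ext = external_regret(payoff, rows, cols)
swp = swap_regret(payoff, rows, cols)
print(ext, swp)
```

Swap regret is always at least as large as external regret, because a constant map is one of the allowed swap functions. In this toy history the best fixed action recovers only half of the missed payoff, while swapping the two actions recovers all of it.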
