Exponential Lower Bounds on the Double Oracle Algorithm in Zero-Sum Games
Brian Hu Zhang, Tuomas Sandholm
TL;DR
The paper analyzes the plain double oracle algorithm for two-player zero-sum games, focusing on worst-case convergence. It proves exponential lower bounds on the number of iterations in both POSGs and EFGs: the EFG bounds require adversarial tiebreaking, whereas the POSG bounds hold regardless of how ties are broken. The constructions include the $2^k$-bigger-number and $n$-bigger-number games mapped to POSGs. The results show that even compactly represented games whose Nash equilibria have small support can force exponentially many iterations, highlighting a fundamental limitation of the method. The discussion situates these findings relative to fictitious play and $\alpha$-best-response dynamics, and outlines directions for achieving polynomial guarantees or robust variants in future work.
Abstract
The double oracle algorithm is a popular method for solving games because it reduces computing equilibria to computing a series of best responses. However, its theoretical properties are not well understood. In this paper, we provide exponential lower bounds on the performance of the double oracle algorithm in both partially observable stochastic games (POSGs) and extensive-form games (EFGs). Our results depend on what is assumed about the tiebreaking scheme, that is, which meta-Nash equilibrium or best response is chosen when there are multiple to pick from. In particular, for EFGs, our lower bounds require adversarial tiebreaking, whereas for POSGs, our lower bounds apply regardless of how ties are broken.
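
To make the role of tiebreaking concrete, the following is a minimal Python sketch of double oracle on a toy normal-form "bigger-number" game. It is an illustration under stated assumptions, not the paper's construction: the rules (each player names an integer in 1..n, the larger number wins, equal numbers draw), the starting strategy sets, the adversarial rule "add the smallest best response", and the function names are all hypothetical choices made here. The restricted-game solver only searches for pure saddle points, which suffices for this toy game but is not a general meta-Nash solver.

```python
def payoff(a, b):
    """Payoff to player 1 (assumed rules): +1 if a > b, 0 if equal, -1 if a < b."""
    return (a > b) - (a < b)


def pure_saddle_point(rows, cols):
    """Return a pure equilibrium (a, b) of the restricted game.

    Assumption: the restricted game has a pure saddle point, which holds
    for the toy bigger-number game but not for zero-sum games in general.
    """
    for a in rows:
        for b in cols:
            v = payoff(a, b)
            row_ok = all(payoff(a2, b) <= v for a2 in rows)
            col_ok = all(payoff(a, b2) >= v for b2 in cols)
            if row_ok and col_ok:
                return a, b
    raise ValueError("no pure saddle point in the restricted game")


def double_oracle_iterations(n):
    """Count double oracle iterations on the n-bigger-number toy game.

    Best-response ties are broken adversarially by always adding the
    smallest best response (a hypothetical worst-case rule).
    """
    actions = range(1, n + 1)
    rows, cols = [1], [1]  # initial restricted strategy sets
    iterations = 0
    while True:
        iterations += 1
        a, b = pure_saddle_point(rows, cols)
        v = payoff(a, b)
        # Best-response values against the meta-equilibrium in the full game.
        best_row_value = max(payoff(x, b) for x in actions)
        best_col_value = min(payoff(a, y) for y in actions)
        if best_row_value <= v and best_col_value >= v:
            return iterations  # neither player can improve: equilibrium found
        # Adversarial tiebreak: pick the smallest best response for each player.
        br_row = min(x for x in actions if payoff(x, b) == best_row_value)
        br_col = min(y for y in actions if payoff(a, y) == best_col_value)
        if br_row not in rows:
            rows.append(br_row)
        if br_col not in cols:
            cols.append(br_col)


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, double_oracle_iterations(2 ** k))
```

Under these assumptions the loop adds one new number per player per iteration, so the iteration count equals n. If n = 2^k and the game is described with k bits, the count is exponential in the description size, which is the flavor of behavior the lower bounds formalize. A tiebreak that instead picked the largest best response would finish this toy game in two iterations, which is why the assumed tiebreaking scheme matters.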
