A Framework for Finding Local Saddle Points in Two-Player Zero-Sum Black-Box Games
Shubhankar Agarwal, Hamzah I. Khan, Sandeep P. Chinchali, David Fridovich-Keil
TL;DR
This work tackles the challenge of finding local saddle points in unknown, nonconvex-nonconcave two-player zero-sum games using only zeroth-order samples. It introduces a two-level Bayesian optimization framework: a high-level GP surrogate refines the unknown objective by sampling at carefully chosen points, and a low-level general-sum game on the GP model identifies local Nash points to guide sampling. The authors develop LLGame, a Newton-based solver for the low-level game, and BSP, a high-level procedure that iteratively samples and updates the GP until a local saddle point is certified via first- and second-order conditions, with multiple variants to balance exploration, exploitation, and sampling cost. Experiments on synthetic benchmarks and ARIMA-MPC settings show the approach can outperform baselines and provide robustness advantages, including improved out-of-distribution performance in a robust MPC context. The framework offers a flexible, extensible template for black-box saddle-point optimization with zeroth-order data, highlighting both practical utility and avenues for future theoretical and scalability enhancements.
Abstract
Saddle point optimization is a critical problem that arises in numerous real-world applications, including portfolio optimization, generative adversarial networks, and robotics. It has been extensively studied in cases where the objective function is known and differentiable. Existing work in black-box settings, where the unknown objective can only be sampled, either assumes convexity-concavity of the objective to simplify the problem or relies on noisy gradient estimators. In contrast, we introduce a framework inspired by Bayesian optimization that uses Gaussian processes to model the unknown (potentially nonconvex-nonconcave) objective and requires only zeroth-order samples. Our approach frames the saddle point optimization problem as a two-level process that can flexibly integrate existing and novel approaches to this problem. The upper level of our framework produces a model of the objective function by sampling in promising locations, and the lower level uses the current model to frame and solve a general-sum game to identify locations to sample. This lower-level procedure can be designed in complementary ways, and we demonstrate the flexibility of our approach by introducing variants that trade off between factors such as runtime, the cost of function evaluations, and the number of available initial samples. We experimentally demonstrate these algorithms on synthetic and realistic datasets in black-box nonconvex-nonconcave settings, showcasing their ability to efficiently locate local saddle points in these contexts.
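To make the notion of a local saddle point concrete, the following minimal Python sketch checks the standard first- and second-order sufficient conditions at a candidate point. It is an illustration only and not the authors' LLGame or BSP procedure: it assumes scalar decision variables, assumes the convention that x is the minimizing player and y the maximizing player, and uses central finite differences on a known toy function rather than a Gaussian process surrogate. The names grad_and_curvature, is_local_saddle, and toy_objective are hypothetical.

```python
def grad_and_curvature(f, x, y, h=1e-4):
    """Central finite-difference estimates of first and second
    partial derivatives of f at (x, y), for scalar x and y."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    return fx, fy, fxx, fyy


def is_local_saddle(f, x, y, tol=1e-3):
    """Check sufficient conditions for a local saddle point, assuming
    x minimizes and y maximizes f (an assumed convention):
      first order:  df/dx = 0 and df/dy = 0 (up to tol)
      second order: d2f/dx2 > 0 and d2f/dy2 < 0
    """
    fx, fy, fxx, fyy = grad_and_curvature(f, x, y)
    first_order = abs(fx) <= tol and abs(fy) <= tol
    second_order = fxx > tol and fyy < -tol
    return first_order and second_order


def toy_objective(x, y):
    # Hypothetical test function with a saddle point at the origin.
    return x**2 - y**2 + x * y


if __name__ == "__main__":
    print(is_local_saddle(toy_objective, 0.0, 0.0))
    print(is_local_saddle(toy_objective, 1.0, 0.0))
```

In the black-box setting described above, the objective's derivatives are not directly available, so a check of this kind would have to be applied to a learned model of the objective rather than to the true function; the finite-difference version here is only a stand-in for that step.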
