An Online Learning Approach for Two-Player Zero-Sum Linear Quadratic Games
Shanting Wang, Weihao Sun, Andreas A. Malikopoulos
Abstract
In this paper, we present an online learning approach for two-player zero-sum linear quadratic games with unknown dynamics. We develop a framework that combines regularized least-squares model estimation, high-probability confidence sets, and surrogate model selection to maintain a regular model for policy updates. At each episode, we apply a shrinkage step to identify a surrogate model within the region where the generalized algebraic Riccati equation admits a stabilizing saddle-point solution. We then provide a regret analysis for the algorithm and present a numerical example that illustrates its convergence performance and corroborates the analysis.
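As background for the saddle-point structure the abstract refers to, the generalized algebraic Riccati equation of a discrete-time zero-sum linear quadratic game can be approached by value iteration on the game Riccati recursion. The sketch below is illustrative only: the dynamics matrices, cost weights, and the helper `gare_value_iteration` are assumptions for exposition, not the paper's algorithm or data.

```python
import numpy as np

def gare_value_iteration(A, B, D, Q, Ru, Rw, iters=1000, tol=1e-10):
    """Fixed-point iteration on the game Riccati recursion.

    Dynamics: x_{t+1} = A x_t + B u_t + D w_t, with minimizing player u,
    maximizing player w, and stage cost x'Qx + u'Ru u - w'Rw w.
    """
    n = A.shape[0]
    m = B.shape[1]
    S = np.hstack([B, D])            # stacked input matrices of both players
    P = np.zeros((n, n))
    for _ in range(iters):
        # Block weighting on the joint input (u, w); the lower-right block
        # must stay negative definite for a saddle point to exist.
        R = np.block([[Ru + B.T @ P @ B,  B.T @ P @ D],
                      [D.T @ P @ B,      -Rw + D.T @ P @ D]])
        P_next = Q + A.T @ P @ A - A.T @ P @ S @ np.linalg.solve(R, S.T @ P @ A)
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    K = np.linalg.solve(R, S.T @ P @ A)  # joint saddle-point gain
    return P, K[:m], K[m:]               # P, Ku (minimizer), Kw (maximizer)

# Illustrative problem data (not from the paper)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[1.0], [0.5]])
D = np.array([[0.2], [0.1]])
Q = np.eye(2)
Ru = np.array([[1.0]])
Rw = np.array([[5.0]])   # large enough that the disturbance penalty dominates

P, Ku, Kw = gare_value_iteration(A, B, D, Q, Ru, Rw)
A_cl = A - B @ Ku - D @ Kw   # closed loop under u = -Ku x, w = -Kw x
```

When the iteration converges with the block condition satisfied, the resulting policies form a stabilizing saddle point; the online setting studied in the paper must additionally contend with `A`, `B`, `D` being unknown, which is where the estimation and shrinkage machinery enters.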
