Continuous-Time Analysis of Heavy Ball Momentum in Min-Max Games
Yi Feng, Kaito Fujii, Stratis Skoulakis, Xiao Wang, Volkan Cevher
TL;DR
This work provides a continuous-time analysis of heavy-ball (HB) momentum in min-max games, examining both simultaneous and alternating update schemes near local Nash equilibria. The authors show that, unlike in minimization, smaller momentum can widen the range of step sizes that yield local convergence, and that alternating updates often enhance convergence. They also show that implicit gradient regularization drives trajectories toward regions of the loss landscape with shallower slopes. Theoretical results are complemented by numerical experiments on 2D functions and GANs, which validate the local convergence and stability benefits of smaller momentum and alternating updates. Overall, the study reveals fundamental differences between HB dynamics in min-max games and in minimization, offering guidance for designing more stable min-max optimization algorithms. The findings have practical implications for training GANs and other adversarial models where stability and convergence are critical.
Abstract
Since Polyak's pioneering work, heavy ball (HB) momentum has been widely studied in minimization. However, its role in min-max games remains largely unexplored. Because momentum is a key component of practical min-max algorithms like Adam, this gap limits their effectiveness. In this paper, we present a continuous-time analysis of HB with simultaneous and alternating update schemes in min-max games. Locally, we prove that smaller momentum enhances algorithmic stability by enabling local convergence across a wider range of step sizes, with alternating updates generally converging faster. Globally, we study the implicit regularization of HB and find that smaller momentum guides algorithm trajectories towards shallower-slope regions of the loss landscape, with alternating updates amplifying this effect. Surprisingly, all these phenomena differ from those observed in minimization, where larger momentum yields similar effects. Our results reveal fundamental differences between HB in min-max games and in minimization, and numerical experiments further validate our theoretical results.
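To make the two update schemes concrete, the following is a minimal sketch of discrete-time HB for min_x max_y f(x, y). It assumes the common form in which each player takes a gradient step plus a momentum term beta * (current iterate - previous iterate); the paper's exact parameterization may differ. The function name `heavy_ball_minmax`, the toy objective f(x, y) = 0.5 x^2 + x y - 0.5 y^2, the zero initial momentum, and the values of the step size and momentum are all illustrative assumptions, not taken from the paper.

```python
import math


def heavy_ball_minmax(grad_x, grad_y, x0, y0, step=0.1, beta=0.1,
                      n_steps=300, alternating=False):
    """Heavy-ball iteration for min_x max_y f(x, y); returns the final (x, y).

    Assumed update form (hypothetical, for illustration only):
        x_next = x - step * df/dx + beta * (x - x_prev)
        y_next = y + step * df/dy + beta * (y - y_prev)
    """
    x, y = float(x0), float(y0)
    x_prev, y_prev = x, y  # assumption: start with zero momentum
    for _ in range(n_steps):
        # The min player descends along its gradient.
        x_new = x - step * grad_x(x, y) + beta * (x - x_prev)
        # Simultaneous: the max player sees the old x.
        # Alternating: the max player sees the freshly updated x.
        x_seen = x_new if alternating else x
        y_new = y + step * grad_y(x_seen, y) + beta * (y - y_prev)
        x_prev, y_prev = x, y
        x, y = x_new, y_new
    return x, y


# Toy objective (assumption): f(x, y) = 0.5 * x**2 + x * y - 0.5 * y**2,
# whose unique Nash equilibrium is (0, 0).
def toy_grad_x(x, y):
    return x + y


def toy_grad_y(x, y):
    return x - y


for scheme_name, alt in [("simultaneous", False), ("alternating", True)]:
    for beta in (0.0, 0.1):
        x, y = heavy_ball_minmax(toy_grad_x, toy_grad_y, 1.0, 1.0,
                                 step=0.1, beta=beta, alternating=alt)
        print(f"{scheme_name:12s} beta={beta:.1f} "
              f"distance to equilibrium = {math.hypot(x, y):.3e}")
```

The only difference between the two schemes is which value of x the max player uses when computing its gradient. Sweeping `step` and `beta` in this sketch is one way to explore, on a toy problem, how the set of step sizes that converge changes with momentum.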
