Coupled Wasserstein Gradient Flows for Min-Max and Cooperative Games
Lauren Conger, Franca Hoffmann, Eric Mazumdar, Lillian J. Ratliff
TL;DR
We model two-player interactions over distributions via coupled Wasserstein gradient flows, formulating min-max and cooperative two-species PDEs that evolve in Wasserstein-2 space. We establish existence and uniqueness of steady states and Nash equilibria, and prove exponential convergence to these equilibria under displacement convexity/concavity, via an HWI-type inequality and a Danskin-type Gamma-convergence approach. We extend the theory to timescale-separated regimes in ML-driven distribution shift, providing explicit rates and an interpretation of how algorithmic updates interact with strategic populations. Numerical experiments on Colombia census data, loan applications, and performative prediction demonstrate distribution-level effects and the necessity of modeling intra-population interactions rather than moments alone. The results advance the understanding of infinite-dimensional game dynamics and offer a principled framework for analyzing distribution shift in ML systems with strategic agents.
Abstract
We propose a framework for two-player infinite-dimensional games with cooperative or competitive structure. These games take the form of coupled partial differential equations in which players optimize over a space of measures, driven by either gradient descent or gradient descent-ascent in Wasserstein-2 space. We characterize the properties of the Nash equilibrium of the system and relate it to the steady state of the dynamics. In the min-max setting, we show that, under sufficient convexity conditions, solutions converge exponentially fast and with an explicit rate to the unique Nash equilibrium. Similar results are obtained for the cooperative setting. We apply this framework to distribution shift induced by interactions between a strategic population of agents and an algorithm, proving additional convergence results in the timescale-separated setting. We illustrate the performance of our model on (i) real data from an economics study of Colombian census data, (ii) feature modification in loan applications, and (iii) performative prediction. The numerical experiments demonstrate the importance of distribution-level, rather than moment-level, modeling.
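To make the min-max dynamics concrete, the following is a minimal particle sketch of gradient descent-ascent over two measures. It is not the authors' model or numerical scheme. The toy energy, the parameter names (a, b, c), the function name, and the forward-Euler discretization are all assumptions chosen for illustration, and the sketch omits any diffusion term. The energy is E[rho, mu] = (a/2) ∫ x^2 d rho - (b/2) ∫ y^2 d mu + c (∫ x d rho)(∫ y d mu), which is convex in rho and concave in mu, so each particle of rho moves down the gradient of the first variation of E in rho, while each particle of mu moves up the gradient of the first variation in mu.

```python
import numpy as np


def simulate_gda_particles(a=1.0, b=1.0, c=0.5, n=200, dt=0.01, steps=2000, seed=0):
    """Forward-Euler particle approximation of a toy min-max flow.

    Hypothetical toy energy (an assumption, not taken from the paper):
        E[rho, mu] = (a/2) int x^2 d rho - (b/2) int y^2 d mu
                     + c * (int x d rho) * (int y d mu)
    rho performs descent on E, mu performs ascent on E.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(2.0, 1.0, size=n)   # particles representing rho (minimizer)
    y = rng.normal(-1.0, 1.0, size=n)  # particles representing mu (maximizer)

    for _ in range(steps):
        mean_x, mean_y = x.mean(), y.mean()
        # First variation in rho: (a/2) x^2 + c * x * mean_y.
        # Descent velocity is minus its spatial gradient.
        vx = -(a * x + c * mean_y)
        # First variation in mu: -(b/2) y^2 + c * mean_x * y.
        # Ascent velocity is plus its spatial gradient.
        vy = -b * y + c * mean_x
        x = x + dt * vx
        y = y + dt * vy
    return x, y


if __name__ == "__main__":
    x, y = simulate_gda_particles()
    print("largest |x| particle:", np.abs(x).max())
    print("largest |y| particle:", np.abs(y).max())
```

For this toy energy with a, b > 0, both particle clouds contract toward the origin, so the pair of point masses at zero plays the role of the unique equilibrium. This mirrors, in a very simplified setting, the exponential convergence described above.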
