Learning and steering game dynamics towards desirable outcomes
Ilayda Canyakmaz, Iosif Sakos, Wayne Lin, Antonios Varvitsiotis, Georgios Piliouras
TL;DR
This work addresses the problem of steering evolving game dynamics toward desirable equilibria when the underlying update rules are unknown and data are scarce. It introduces SIAR-MPC, which couples Side Information Assisted Regression (SIAR) with Model Predictive Control (MPC): SIAR is extended to learn controlled dynamics under constraints such as robust forward invariance and positive correlation, and MPC then computes dynamic incentives from the learned model. Across coordination and zero-sum games, including chaotic regimes, SIAR-MPC achieves convergence to socially optimal equilibria and stabilizes oscillatory behavior using far fewer training samples than competing methods such as SINDYc and PINN-MPC. This data-efficient framework holds promise for real-time, constraint-aware policy design in strategic environments where model-free controllers would struggle with limited observations.
Abstract
Game dynamics, which describe how agents' strategies evolve over time based on past interactions, can exhibit a variety of undesirable behaviors, including convergence to suboptimal equilibria, cycling, and chaos. While central planners can employ incentives to mitigate such behaviors and steer game dynamics towards desirable outcomes, the effectiveness of such interventions relies critically on accurately predicting agents' responses to these incentives, a task that is particularly challenging when the underlying dynamics are unknown and observations are limited. To address this challenge, this work introduces the Side Information Assisted Regression with Model Predictive Control (SIAR-MPC) framework. We extend the recently introduced SIAR method to incorporate the effect of control, enabling it to use side-information constraints inherent to game-theoretic applications to model agents' responses to incentives from scarce data. MPC then uses this model to adjust incentives dynamically. Our experiments demonstrate the effectiveness of SIAR-MPC in guiding systems towards socially optimal equilibria and in stabilizing chaotic and cycling behaviors. Notably, it achieves these results in data-scarce settings with few training samples, where well-known system identification methods paired with MPC are less effective.
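
The learn-then-steer pipeline described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the 2x2 stag-hunt payoffs, the scalar state `x` (share of agents playing the socially optimal action), the incentive `w`, and all function names are hypothetical. Side information is imposed only as extra least-squares rows stating that the boundary states are rest points at a few incentive levels, which is a much cruder device than the constraints used by SIAR, and the MPC step simply enumerates constant incentives over a short horizon instead of calling an optimizer. The training samples are noise-free.

```python
import numpy as np

# Hypothetical toy system: replicator dynamics of a 2x2 stag-hunt game, where
# x is the share of agents playing the socially optimal action and w >= 0 is an
# incentive added to that action's payoff. The learner only sees samples of it.
def true_dynamics(x, w):
    return x * (1.0 - x) * (3.0 * x - 2.0 + w)

# Candidate monomials x^i * w^j of total degree <= 3 (10 terms).
EXPONENTS = [(i, j) for i in range(4) for j in range(4 - i)]

def features(x, w):
    x = np.atleast_1d(x).astype(float)
    w = np.atleast_1d(w).astype(float)
    return np.stack([x**i * w**j for i, j in EXPONENTS], axis=1)

def fit_model(xs, ws, xdots, w_grid=(0.0, 1.0, 2.0, 3.0)):
    w_grid = np.asarray(w_grid, dtype=float)
    # Assumed side information, added as extra regression rows: x = 0 and
    # x = 1 are rest points for every incentive level on w_grid.
    side_rows = np.vstack([features(np.zeros_like(w_grid), w_grid),
                           features(np.ones_like(w_grid), w_grid)])
    A = np.vstack([features(xs, ws), side_rows])
    b = np.concatenate([xdots, np.zeros(len(side_rows))])
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef

def model(coef, x, w):
    return float(features(x, w)[0] @ coef)

def mpc_incentive(coef, x, horizon=10, dt=0.1, effort_weight=0.01,
                  candidates=np.linspace(0.0, 3.0, 7)):
    # Receding-horizon control by enumeration: roll the learned model forward
    # under each constant incentive and keep the cheapest one.
    best_w, best_cost = 0.0, np.inf
    for w in candidates:
        x_pred, cost = x, 0.0
        for _ in range(horizon):
            x_pred = float(np.clip(x_pred + dt * model(coef, x_pred, w), 0.0, 1.0))
            cost += (1.0 - x_pred) ** 2 + effort_weight * w**2
        if cost < best_cost:
            best_w, best_cost = float(w), cost
    return best_w

def simulate(x0, policy, steps=100, dt=0.1):
    # Euler simulation of the true system under a state-feedback policy.
    x = x0
    for _ in range(steps):
        x = float(np.clip(x + dt * true_dynamics(x, policy(x)), 0.0, 1.0))
    return x

# Five noise-free training samples; the side-information rows make up for the
# fact that five samples alone cannot pin down ten coefficients.
rng = np.random.default_rng(0)
xs = rng.uniform(0.1, 0.9, size=5)
ws = rng.uniform(0.0, 3.0, size=5)
coef = fit_model(xs, ws, true_dynamics(xs, ws))

x_free = simulate(0.2, lambda x: 0.0)                       # no incentive
x_steered = simulate(0.2, lambda x: mpc_incentive(coef, x))  # learned model + MPC
print(f"uncontrolled: {x_free:.3f}, steered: {x_steered:.3f}")
```

In this toy, the uncontrolled population starting at `x = 0.2` drifts to the inefficient equilibrium at `x = 0`, while the incentive chosen from the learned model pushes it past the interior threshold at `x = 2/3` and on to the efficient equilibrium at `x = 1`. The incentive is only needed until the threshold is crossed, which is the kind of dynamic adjustment a fixed subsidy would not provide.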
