Learning to Control Unknown Strongly Monotone Games

Siddharth Chandak; Ilai Bistritz; Nicholas Bambos

TL;DR

This work addresses the problem of steering the unique Nash equilibrium of an unknown, strongly monotone game so that it satisfies linear constraints, without requiring players to disclose their reward functions. It introduces a two-time-scale online algorithm in which the manager adjusts control inputs based on constraint-violation feedback while players update their actions using noisy gradient-based learning. The authors prove almost-sure convergence to the set of constraint-satisfying equilibria and establish a finite-time mean-square convergence rate of nearly $O(t^{-1/4})$ under standard step-size conditions. The method preserves user privacy and applies to settings such as quadratic global objectives and weighted resource allocation, offering a principled, scalable mechanism for reaching efficient equilibria in large networks.

Abstract

Consider a game where the players' utility functions include a reward function and a linear term for each dimension, with coefficients that are controlled by the manager. We assume that the game is strongly monotone, so gradient play converges to a unique Nash equilibrium (NE). The NE is typically globally inefficient. The global performance at the NE can be improved by imposing linear constraints on the NE. We therefore want the manager to pick the controlled coefficients that impose the desired constraints on the NE. However, this requires knowing the players' reward functions and action sets. Obtaining this game information is infeasible in a large-scale network and violates user privacy. To overcome this, we propose a simple algorithm that learns to shift the NE to meet the linear constraints by adjusting the controlled coefficients online. Our algorithm only requires the violation of the linear constraints as feedback and does not need to know the reward functions or the action sets. We prove that our algorithm converges with probability 1 to the set of NE that satisfy the target linear constraints. We then prove an L2 convergence rate of near-$O(t^{-1/4})$.
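The two-time-scale idea described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the paper's algorithm as published: the three-player quadratic rewards, the box action set $[0, 5]$, the constraint "total action equals 3", the step-size exponents, the sign convention of the linear term, and the manager update $\boldsymbol{\alpha} \leftarrow \boldsymbol{\alpha} - \epsilon_t A^\top (A\boldsymbol{x}_t - \boldsymbol{b})$ are all hypothetical choices made for illustration. The function name `run_game_control` is likewise invented here.

```python
import numpy as np

def run_game_control(T=20000, seed=0):
    """Toy two-time-scale loop: players run noisy projected gradient play,
    the manager adjusts alpha using only the constraint violation."""
    rng = np.random.default_rng(seed)
    N = 3
    # Hypothetical constraint A x = b: total action should equal 3.
    A = np.ones((1, N))
    b = np.array([3.0])
    x = np.zeros(N)      # player actions
    alpha = np.zeros(N)  # manager-controlled linear coefficients
    for t in range(T):
        eta = 0.5 / (t + 1) ** 0.5  # players' (faster) step size
        eps = 0.5 / (t + 1) ** 0.8  # manager's (slower) step size
        # Assumed utility: u_n = -x_n^2 + 0.2 * x_n * sum_{m != n} x_m + alpha_n * x_n.
        # This toy game is strongly monotone; the manager never evaluates it.
        grad = -2.0 * x + 0.2 * (x.sum() - x) + alpha
        noisy_grad = grad + 0.1 * rng.standard_normal(N)
        # Projected gradient ascent onto the assumed action set [0, 5].
        x_next = np.clip(x + eta * noisy_grad, 0.0, 5.0)
        # Manager feedback: only the observed constraint violation.
        violation = A @ x - b
        alpha = alpha - eps * (A.T @ violation)
        x = x_next
    return x, alpha

if __name__ == "__main__":
    x, alpha = run_game_control()
    print("actions:", x, "total:", x.sum())
    print("controlled coefficients:", alpha)
```

In this toy instance the manager's update uses only `A @ x - b`, so it needs neither the reward functions nor the action sets, which mirrors the privacy property the abstract emphasizes. The players' step size decays more slowly than the manager's, so the actions track the equilibrium of the current coefficients while the coefficients drift toward values that satisfy the constraint.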

Paper Structure (17 sections, 8 theorems, 67 equations, 3 figures, 1 algorithm)

Introduction
Related Work
Outline and Notation
Problem Formulation
Applications
Quadratic Global Objective
Weighted Resource Allocation Games
Game Control Algorithm
Performance Guarantees
Convergence Analysis
Non-expansive Mapping
Finite Time Convergence Guarantees
Simulation Results
Conclusion
Proofs from Section \ref{sec:Algorithm}
...and 2 more sections

Key Result

Theorem 1

Under Assumptions 1-5, $\boldsymbol{\alpha}_t$ converges to the set $\mathcal{N}_{opt}$ and $\boldsymbol{x}_t$ converges to $\boldsymbol{x^*}(\boldsymbol{\alpha}_t)$, with probability 1.

Figures (3)

Figure 1: Control of an unknown game
Figure 2: Weighted Load Balancing in Resource Allocation Games
Figure 3: Quadratic Global Objective

Theorems & Definitions (16)

Definition 1
Theorem 1
Theorem 2
Lemma 3
proof
Definition 2
Lemma 4
proof
Lemma 5
proof
...and 6 more
