Mean Field Correlated Imitation Learning

Zhiyu Zhao; Qirui Mi; Ning Yang; Xue Yan; Haifeng Zhang; Jun Wang; Yaodong Yang

TL;DR

This paper introduces Adaptive Mean Field Correlated Equilibrium (AMFCE) to extend mean-field game frameworks to scenarios with time-varying correlated signals, addressing limitations of MFNE and existing MFCE models that assume fixed future signals. Building on this equilibrium, the authors propose Mean Field Correlated Imitation Learning (MFCIL), a GAN-based imitation-learning framework that recovers both the agent policy and the evolving correlation device from expert demonstrations. They prove existence of AMFCE, show that MFNE is a special case of AMFCE, and derive finite-horizon, polynomial-in-$T$ bounds on imitation gaps, improving the tractability of practical MFG-IL. Empirically, MFCIL outperforms state-of-the-art baselines on tasks including Squeeze, RPS, Flock, and real-world traffic flow and TaxAI simulations, illustrating robust recovery of correlated policies and improved population-level predictions. The work provides a principled, scalable approach for modeling and learning in large populations where external, time-varying signals influence collective behavior.

Abstract

We investigate multi-agent imitation learning (IL) within the framework of mean field games (MFGs), considering the presence of time-varying correlated signals. Existing MFG IL algorithms assume demonstrations are sampled from Mean Field Nash Equilibria (MFNE), limiting their adaptability to real-world scenarios. For example, in a traffic network whose equilibrium is influenced by public routing recommendations, the recommendations introduce time-varying correlated signals into the game, which neither MFNE nor other existing correlated equilibrium concepts capture. To address this gap, we propose Adaptive Mean Field Correlated Equilibrium (AMFCE), a general equilibrium incorporating time-varying correlated signals. We establish the existence of AMFCE under mild conditions and prove that MFNE is a subclass of AMFCE. We further propose Correlated Mean Field Imitation Learning (CMFIL), a novel IL framework designed to recover the AMFCE, accompanied by a theoretical guarantee on the quality of the recovered policy. Experimental results, including a real-world traffic flow prediction problem, demonstrate the superiority of CMFIL over state-of-the-art IL baselines, highlighting the potential of CMFIL in understanding large population behavior under correlated signals.
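To make the idea of a time-varying correlated signal more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a finite state, action, and signal space, a correlation device `rho[t]` that may differ at every step, a signal-conditioned policy, and a population update of the form $\mu_{t+1}(s') = \sum_{s,a} \mu_t(s)\,\pi_t(a \mid s, z_t)\,P(s' \mid s, a, \mu_t)$. The function names, array layouts, and the two-state toy game are all hypothetical and chosen only for illustration.

```python
import numpy as np


def rollout_mean_field(mu0, rho, policy, transition, rng):
    """Roll a population state distribution forward under sampled signals.

    mu0:        (S,) initial state distribution
    rho:        (T, Z) array, rho[t] is the signal distribution at step t
    policy:     (T, Z, S, A) array, policy[t, z, s] is an action distribution
    transition: callable mu -> (S, A, S) array with entries P[s, a, s']
    """
    T, Z = rho.shape
    mu = np.asarray(mu0, dtype=float)
    flow, signals = [mu], []
    for t in range(T):
        # The correlation device draws one public signal for this step.
        z = int(rng.choice(Z, p=rho[t]))
        P = transition(mu)
        # mu'(s') = sum_{s,a} mu(s) * pi_t(a | s, z) * P(s' | s, a, mu)
        mu = np.einsum("s,sa,sap->p", mu, policy[t, z], P)
        flow.append(mu)
        signals.append(z)
    return np.stack(flow), signals


def toy_transition(mu):
    """Hypothetical two-state kernel: crowded targets are harder to reach."""
    P = np.zeros((2, 2, 2))
    for a in range(2):
        slip = 0.1 + 0.2 * mu[a]
        P[:, a, a] = 1.0 - slip
        P[:, a, 1 - a] = slip
    return P


# Hypothetical setup: 3 steps, 2 signals, and the device drifts from
# favouring signal 0 to favouring signal 1 over time.
rho = np.array([[0.8, 0.2], [0.5, 0.5], [0.2, 0.8]])

# Agents simply follow the recommendation: action a = signal z.
policy = np.zeros((3, 2, 2, 2))
for z in range(2):
    policy[:, z, :, z] = 1.0

rng = np.random.default_rng(0)
flow, signals = rollout_mean_field(
    np.array([0.5, 0.5]), rho, policy, toy_transition, rng
)
```

Each row of `flow` is the population distribution at one step, and `signals` records the sampled recommendations. Under these assumptions, an imitation learner in this setting would need to estimate both `policy` and the per-step `rho` from demonstrations, since a single time-invariant signal distribution could not reproduce the drift built into this toy device.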

Paper Structure (44 sections, 20 theorems, 42 equations, 1 figure, 8 tables, 1 algorithm)

Introduction
Related work
Multi-agent Imitation Learning
Mean Field Equilibria Concepts
Preliminaries
Classic mean field Nash equilibrium
Imitation Learning
Problem formulation
Adaptive Mean Field Correlated Equilibrium
Difference between AMFCE and MFCE
Properties of AMFCE
Imitation learning for AMFCE
Experiments
Tasks
Squeeze
...and 29 more sections

Key Result

Theorem 4.5

If the reward function $r(s,a,\mu)$ and the transition kernel $P(s'|s, a, \mu)$ are bounded and continuous with respect to the population state distribution $\mu$, there exists at least one AMFCE solution.

Figures (1)

Figure 1: The distribution of the correlation device $\rho$ recovered by MFCIL. The solid line shows the mean and the shaded area represents the standard deviation over 3 independent runs. The dashed line shows the ground truth of $\rho$.

Theorems & Definitions (28)

Definition 3.1: MFNE
Definition 4.1: Correlation Device
Definition 4.2: Behavioral Policy
Definition 4.3: AMFCE
Example 4.4
Theorem 4.5
Corollary 4.5
Definition 5.1: CIP
Proposition 5.1
Theorem 5.2
...and 18 more
