Adversarial Inverse Reinforcement Learning for Mean Field Games
Yang Chen, Libo Zhang, Jiamou Liu, Michael Witbrock
TL;DR
This work tackles inverse reinforcement learning in mean-field games to model and design large-scale multi-agent systems under bounded rationality. It introduces Mean-Field Adversarial IRL (MF-AIRL), built on the entropy-regularised mean-field Nash equilibrium (ERMFNE) and adversarial MaxEnt IRL, to recover reward signals from imperfect demonstrations. The authors derive a tractable extension of MaxEnt IRL to MFGs by substituting the empirical mean field, and frame MF-AIRL as an adversarial learning problem with a discriminator and adaptive samplers; the recovered reward is invariant to reward shaping by a potential function. Empirical results on five large-scale tasks show that MF-AIRL outperforms centralised and decentralised baselines in reward recovery and in robustness to changes in dynamics, demonstrating practical viability for environment design and behaviour prediction in large multi-agent systems.
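To make the discriminator and potential function mentioned above concrete, here is a minimal sketch. It assumes the standard single-agent AIRL discriminator form, with the mean field passed as an extra argument to both the reward and the potential; the exact parameterisation used in MF-AIRL is not stated in this summary. The names `reward`, `potential`, `shaped_logit`, and `discriminator`, the toy crowd-aversion reward, and the value of `GAMMA` are all hypothetical.

```python
import numpy as np

GAMMA = 0.99  # discount factor (assumed value)

def reward(s, a, mu):
    # Hypothetical crowd-aversion reward: penalise occupying a crowded state.
    return -mu[s]

def potential(s, mu):
    # Hypothetical shaping potential h(s, mu); chosen only for illustration.
    return 0.5 * mu[s]

def shaped_logit(s, a, mu, s_next, mu_next):
    # f = r(s, a, mu) + gamma * h(s', mu') - h(s, mu)
    return reward(s, a, mu) + GAMMA * potential(s_next, mu_next) - potential(s, mu)

def discriminator(f, log_pi):
    # D = exp(f) / (exp(f) + pi(a|s)), written as a sigmoid of f - log pi
    return 1.0 / (1.0 + np.exp(-(f - log_pi)))

mu = np.array([0.5, 0.3, 0.2])       # mean field (state distribution) at time t
mu_next = np.array([0.4, 0.4, 0.2])  # mean field at time t + 1
f = shaped_logit(s=0, a=1, mu=mu, s_next=1, mu_next=mu_next)
d = discriminator(f, log_pi=np.log(0.25))  # policy assigns probability 0.25 to the action
```

In this form the potential term only shifts the logit by a difference of state values, which is why a reward learned this way can be identified only up to potential-based shaping.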
Abstract
Mean field games (MFGs) provide a mathematically tractable framework for modelling large-scale multi-agent systems by leveraging mean field theory to simplify interactions among agents. This makes it possible to apply inverse reinforcement learning (IRL) to predict the behaviours of large populations by recovering reward signals from demonstrated behaviours. However, existing IRL methods for MFGs cannot reason about uncertainties in the demonstrated behaviours of individual agents. This paper proposes a novel framework, Mean-Field Adversarial IRL (MF-AIRL), which is capable of handling uncertainties in demonstrations. We build MF-AIRL upon maximum entropy IRL and a new equilibrium concept. We evaluate our approach on simulated tasks with imperfect demonstrations. Experimental results demonstrate the superiority of MF-AIRL over existing methods in reward recovery.
