Learning Two-agent Motion Planning Strategies from Generalized Nash Equilibrium for Model Predictive Control
Hansung Kim, Edward L. Zhu, Chang Seok Lim, Francesco Borrelli
TL;DR
This work tackles real-time two-agent motion planning by learning game-theoretic (GT) interaction outcomes from generalized Nash equilibrium (GNE) data and embedding them as the terminal cost in model predictive control (MPC). The method, IGT-MPC, has three parts: offline GNE data generation, a neural network trained to predict the GT reward, and an online MPC that uses this learned predictor (V_GT) as its terminal cost so that each agent implicitly accounts for the other. It is demonstrated in two scenarios, competitive head-to-head racing and cooperative intersection navigation, where V_GT-guided MPC achieves higher feasibility, fewer gridlocks, and more strategic behavior than a naive progress-maximizing terminal cost. The method is effective for two agents, but the authors acknowledge scalability and generalization challenges and propose richer representations and faster solvers as avenues for future work.
Abstract
We introduce Implicit Game-Theoretic MPC (IGT-MPC), a decentralized algorithm for two-agent motion planning. It uses a learned value function that predicts game-theoretic interaction outcomes as the terminal cost-to-go function in a model predictive control (MPC) framework, guiding agents to implicitly account for interactions with other agents and maximize their reward. This approach applies to competitive and cooperative multi-agent motion planning problems, which we formulate as constrained dynamic games. Given a constrained dynamic game, we randomly sample initial conditions and solve for the generalized Nash equilibrium (GNE) to generate a dataset of GNE solutions, computing the reward outcome of each game-theoretic interaction from the GNE. The data is used to train a simple neural network to predict the reward outcome, which we use as the terminal cost-to-go function in an MPC scheme. We showcase emergent competitive and coordinated behaviors using IGT-MPC in scenarios such as two-vehicle head-to-head racing and unsignalized intersection navigation. IGT-MPC offers a novel method integrating machine learning and game-theoretic reasoning into model-based decentralized multi-agent motion planning.
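The pipeline described in the abstract (sample initial conditions, record the GNE reward outcome, fit a predictor, use it as the MPC terminal cost) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: `solve_gne_reward` is a hypothetical stand-in for a real GNE solver, the dynamics are a 1D single integrator, the opponent is predicted at constant velocity, constraints such as collision avoidance are omitted, and the "simple neural network" is replaced by a one-hidden-layer network with fixed random hidden weights and a least-squares output layer so that the example needs only NumPy.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

def solve_gne_reward(joint_state):
    # HYPOTHETICAL stand-in for an offline GNE solve: returns a toy ego
    # reward that is high when the ego agent is ahead of the opponent.
    ego_pos, opp_pos = joint_state
    return np.tanh(ego_pos - opp_pos)

# 1) Offline: sample initial conditions and record the reward outcome.
X = rng.uniform(-2.0, 2.0, size=(500, 2))   # rows are [ego_pos, opp_pos]
y = np.array([solve_gne_reward(x) for x in X])

# 2) Fit a reward predictor (random hidden layer + least-squares readout,
#    swapped in here for a trained neural network).
W = rng.normal(size=(2, 64))
b = rng.normal(size=64)

def hidden(states):
    states = np.atleast_2d(states)
    h = np.tanh(states @ W + b)
    return np.hstack([h, np.ones((h.shape[0], 1))])  # append a bias column

theta, *_ = np.linalg.lstsq(hidden(X), y, rcond=None)

def v_gt(states):
    # Learned terminal value: predicted reward outcome for a joint state.
    return hidden(states) @ theta

# 3) Online: MPC by enumerating short control sequences (assumed toy setup).
U = np.linspace(-1.0, 1.0, 5)   # admissible ego controls
DT, N = 0.5, 3                  # time step and horizon length

def mpc_step(ego_pos, opp_pos, opp_vel):
    # Constant-velocity prediction of the opponent over the horizon.
    opp_terminal = opp_pos + N * DT * opp_vel
    best_u0, best_cost = None, np.inf
    for seq in itertools.product(U, repeat=N):
        ego_terminal = ego_pos + DT * sum(seq)
        stage = 0.01 * sum(u * u for u in seq)          # control effort
        cost = stage - v_gt([ego_terminal, opp_terminal])[0]
        if cost < best_cost:
            best_u0, best_cost = seq[0], cost
    return best_u0               # apply only the first control

train_mae = float(np.mean(np.abs(v_gt(X) - y)))
u0 = mpc_step(ego_pos=0.0, opp_pos=0.2, opp_vel=0.5)
print(f"train MAE = {train_mae:.4f}, first control = {u0:+.2f}")
```

In this sketch the terminal term is subtracted because the predictor estimates a reward while the MPC minimizes a cost. The other agent's behavior enters only through the learned value, which is the sense in which the interaction is handled implicitly.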
