Constraint Learning in Multi-Agent Dynamic Games from Demonstrations of Local Nash Interactions

Zhouyu Zhang, Chih-Yuan Chiu, Glen Chou

TL;DR

This work tackles learning coupled, multi-agent constraints from demonstrations of local Nash interactions by formulating an inverse dynamic-game problem that enforces the forward-game KKT conditions. It shows how to recast constraint inference as MILPs/MIBLPs for offset- and affine-parameterized constraints, which yields inner approximations of the true safe and unsafe sets and supports volume extraction for robust motion planning under uncertainty. The authors demonstrate accurate constraint recovery and safe planning across double integrator, unicycle, and quadcopter dynamics in both simulation and hardware, and compare favorably against cost-inference baselines that fail to ensure safety. A key contribution is the combination of KKT-based inverse learning with volume-extraction planning, yielding provable safety guarantees and practical robustness to mis-specification and limited demonstrations.

Abstract

We present an inverse dynamic game-based algorithm to learn parametric constraints from a given dataset of local Nash equilibrium interactions between multiple agents. Specifically, we introduce mixed-integer linear programs (MILP) encoding the Karush-Kuhn-Tucker (KKT) conditions of the interacting agents, which recover constraints consistent with the local Nash stationarity of the interaction demonstrations. We establish theoretical guarantees that our method learns inner approximations of the true safe and unsafe sets. We also use the interaction constraints recovered by our method to design motion plans that robustly satisfy the underlying constraints. Across simulations and hardware experiments, our methods accurately inferred constraints and designed safe interactive motion plans for various classes of constraints, both convex and non-convex, from interaction demonstrations of agents with nonlinear dynamics.
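To make the KKT-based recovery idea concrete, the following is a minimal sketch on a hypothetical one-dimensional, two-agent toy game; it is not the authors' formulation. The costs, the constraint form `theta - (x2 - x1) <= 0`, the demonstration values, and the name `kkt_consistent` are all assumptions made for illustration. Instead of solving a MILP, the sketch enumerates a grid of candidate offsets and keeps those under which every demonstration satisfies each agent's KKT conditions. In a MILP formulation, the complementary-slackness products would typically be linearized with binary variables.

```python
import numpy as np

# Hypothetical toy game (an assumption, not taken from the paper):
# agent i minimizes (x_i - g_i)^2 subject to the shared constraint
# h(x, theta) = theta - (x2 - x1) <= 0, i.e. agent 2 stays at least
# theta ahead of agent 1.

def kkt_consistent(theta, demo, tol=1e-6):
    """Check whether a demonstration satisfies both agents' KKT conditions under theta."""
    x1, x2, g1, g2 = demo
    h = theta - (x2 - x1)
    # Stationarity pins down each agent's multiplier for this toy game.
    lam1 = -2.0 * (x1 - g1)  # from 2 (x1 - g1) + lam1 = 0
    lam2 = 2.0 * (x2 - g2)   # from 2 (x2 - g2) - lam2 = 0
    primal = h <= tol                                          # primal feasibility
    dual = lam1 >= -tol and lam2 >= -tol                       # dual feasibility
    comp = abs(lam1 * h) <= tol and abs(lam2 * h) <= tol       # complementary slackness
    return bool(primal and dual and comp)

# Each demonstration is (x1, x2, g1, g2). In the first, both agents are held
# away from their goals, so the constraint must be tight. In the second, both
# agents reach their goals, which only bounds theta from above.
demos = [
    (0.8, 1.8, 1.0, 1.5),
    (0.0, 3.0, 0.0, 3.0),
]

theta_grid = np.linspace(0.0, 3.0, 31)
consistent = np.array(
    [t for t in theta_grid if all(kkt_consistent(t, d) for d in demos)]
)
print(consistent)
```

In this toy setting the first demonstration has strictly positive multipliers, so complementary slackness forces the constraint to be active and pins the offset to the observed gap. The second demonstration alone would leave every offset up to its gap admissible.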

Paper Structure

This paper contains 70 sections, 8 theorems, 120 equations, 14 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

(Conservativeness of Safe and Unsafe Set Recovery from Eqn: KKT, Inverse, Optimal) Define the learned set of guaranteed safe (resp., unsafe) trajectories, denoted by $\mathcal{G}_s(\mathcal{D})$ (resp., $\mathcal{G}_{\urcorner s}(\mathcal{D})$), to be the set of trajectories that are safe (resp., uns... Then $\mathcal{G}_s(\mathcal{D}) \subseteq \mathcal{S}(\theta^\star)$ and $\mathcal{G}_{\urcorner s}$...
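The inclusion $\mathcal{G}_s(\mathcal{D}) \subseteq \mathcal{S}(\theta^\star)$ can be illustrated with a small numerical sketch. Since the definition above is cut off, the sketch assumes one common reading: a point is guaranteed safe (resp., unsafe) if it is safe (resp., unsafe) under every parameter value consistent with the demonstrations. The scalar constraint `gap >= theta`, the set of consistent offsets, and the function name `guaranteed_sets` are hypothetical.

```python
import numpy as np

def guaranteed_sets(gaps, consistent_thetas):
    """Label query points as guaranteed safe / guaranteed unsafe.

    Assumption for this sketch: a point with separation `gap` is safe under
    theta when gap >= theta. It is guaranteed safe (unsafe) if that holds
    (fails) for every theta consistent with the demonstrations.
    """
    gaps = np.asarray(gaps, dtype=float)
    thetas = np.asarray(consistent_thetas, dtype=float)
    safe_under = gaps[:, None] >= thetas[None, :]  # shape (n_points, n_thetas)
    g_safe = safe_under.all(axis=1)
    g_unsafe = (~safe_under).all(axis=1)
    return g_safe, g_unsafe

# Hypothetical consistent set: the demonstrations only narrow theta to [0.8, 1.2].
consistent_thetas = np.linspace(0.8, 1.2, 5)
theta_star = 1.0  # assumed true parameter; it must lie in the consistent set

query_gaps = np.array([0.5, 0.9, 1.0, 1.1, 1.5])
g_safe, g_unsafe = guaranteed_sets(query_gaps, consistent_thetas)

truly_safe = query_gaps >= theta_star
# Inner approximation: every guaranteed-safe point is truly safe, and every
# guaranteed-unsafe point is truly unsafe; points in between stay undecided.
print(g_safe, g_unsafe, truly_safe)
```

Points whose label changes across the consistent parameters (here, gaps of 0.9, 1.0, and 1.1) fall in neither set, which is what makes the recovery conservative rather than exact.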

Figures (14)

  • Figure 2: Learned constraint sets of Agent 1 (blue) and 2 (orange) with double integrator dynamics and (a) ellipsoidal, (b) polytopic, or (d) velocity-dependent spherical collision avoidance constraints. (c) Inner approximation of the safe set (from (b)) via volume extraction, and corresponding safe motion plans. (e, f) Motion plans designed using learned constraints from (d). In all subplots, solid circles at the ends of the trajectories indicate start and goal positions. In (a) and (b), solid squares indicate tight (i.e., activated) constraints. In (d), (e), and (f), red (resp., black) squares indicate Agent 1 (resp., 2) states corresponding to the gray velocity-dependent spherical constraints depicted.
  • Figure 3: (a) Demonstrations (delineated by linewidth) and (b) safe planning via volume extraction for hardware unicycle agents 1 (blue) and 2 (orange) with spherical collision-avoidance constraints. (c)-(d) present the box-constraint case; despite reduced recovery accuracy from demonstrator suboptimality, the volume-extraction planner still produces safe trajectories. In (a), squares on and dashed lines between trajectories indicate constraint activation (i.e., tightness).
  • Figure 4: Constraint learning and planning for Agent 1 (blue) and 2 (orange) with unicycle dynamics satisfying (a, b) proximity or (c, d) line of sight. Learned constraints (shaded green) coincide with the true constraints (dashed black lines).
  • Figure 5: Constraint learning and planning for Agents 1 (blue), 2 (orange), and 3 (yellow) with quadcopter dynamics satisfying spherical collision-avoidance constraints. (a) A demonstration of all agents, in absolute coordinates, interacting while satisfying the constraints. Our method exactly learns all constraint parameters. Filled squares on, and dashed lines between, trajectories indicate tight constraints. (b) Using our learned constraints, we generate safe motion plans via volume extraction over the trajectory space.
  • Figure 6: Our method closely approximates an ellipsoidal collision-avoidance set as the union of three boxes from interaction demonstrations with bicycle dynamics, even when constraints are mistakenly assumed to be box-parameterized.
  • ...and 9 more figures

Theorems & Definitions (17)

  • Theorem 1
  • proof
  • Remark 1
  • Remark 2
  • Theorem 2: Volume Extraction Over Trajectories
  • Theorem 3: Volume Extraction Over Parameter Space
  • Remark 3
  • Theorem 4: Limitations of Learnability
  • Theorem 5
  • proof
  • ...and 7 more