Inferring Foresightedness in Dynamic Noncooperative Games

Cade Armstrong, Ryan Park, Xinjie Liu, Kushagra Gupta, David Fridovich-Keil

TL;DR

This work tackles inferring agents' foresightedness in dynamic noncooperative games by formulating the problem as an inverse dynamic game with time-discounted costs $J^i=\textstyle\sum_t(\gamma^i)^t C^i$ and transforming the equilibrium conditions into a parametric mixed complementarity problem (MiCP). A gradient-based method leverages the directional differentiability of MiCP solutions to estimate hidden parameters $\theta$ and discount factors $\gamma$ from online, potentially partial observations. Across synthetic crosswalks, real-world InD intersections, and Waymax-based simulations, the approach achieves markedly more accurate trajectory predictions than baselines, including up to 33% improvement in model fidelity and better robustness to noise. The results demonstrate the practical value of foresight-aware inverse game modeling for safer and more efficient multi-agent planning in human-robot and autonomous-vehicle settings, with avenues for extending to neural cost models, closed-loop information structures, and state-dependent discounting.

Abstract

Dynamic game theory is an increasingly popular tool for modeling multi-agent, e.g. human-robot, interactions. Game-theoretic models presume that each agent wishes to minimize a private cost function that depends on others' actions. These games typically evolve over a fixed time horizon, specifying how far into the future each agent plans. In practical settings, however, decision-makers may vary in foresightedness, or how much they care about their current cost in relation to their past and future costs. We conjecture that quantifying and estimating each agent's foresightedness from online data will enable safer and more efficient interactions with other agents. To this end, we frame this inference problem as an inverse dynamic game. We consider a specific objective function parametrization that smoothly interpolates myopic and farsighted planning. Games of this form are readily transformed into parametric mixed complementarity problems; we exploit the directional differentiability of solutions to these problems with respect to their hidden parameters to solve for agents' foresightedness. We conduct three experiments: one with synthetically generated delivery robot motion, one with real-world data involving people walking, biking, and driving vehicles, and one using high-fidelity simulators. The results of these experiments demonstrate that explicitly inferring agents' foresightedness enables game-theoretic models to model agents' behavior 33% more accurately.
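To make the interpolation between myopic and farsighted planning concrete, the following minimal sketch evaluates a time-discounted cost of the form $J^i=\sum_t(\gamma^i)^t C^i$ for a single agent. It is an illustration under stated assumptions, not the authors' implementation: the function name `discounted_cost`, the stage-cost values, the convention that the sum starts at $t=0$, and the restriction of $\gamma$ to $[0, 1]$ are all hypothetical choices made here.

```python
from typing import Sequence


def discounted_cost(stage_costs: Sequence[float], gamma: float) -> float:
    """Return sum_t gamma**t * C_t for t = 0, ..., T-1.

    Assumptions (not taken from the paper): time is indexed from 0 and
    gamma lies in [0, 1].
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma is assumed to lie in [0, 1]")
    return sum((gamma ** t) * c for t, c in enumerate(stage_costs))


# Hypothetical stage costs C_t for one agent over a 4-step horizon.
stage_costs = [1.0, 2.0, 3.0, 4.0]

# gamma = 0: only the current stage cost counts (myopic planning).
myopic = discounted_cost(stage_costs, 0.0)

# gamma = 1: every stage is weighted equally (farsighted planning).
farsighted = discounted_cost(stage_costs, 1.0)

# Intermediate gamma: later stages count progressively less.
intermediate = discounted_cost(stage_costs, 0.5)

print(myopic, farsighted, intermediate)
```

Under these assumptions, sweeping `gamma` from 0 to 1 moves the objective continuously from the first stage cost alone to the undiscounted sum, which is the sense in which a single scalar per agent can encode foresightedness.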

Paper Structure

This paper contains 21 sections, 12 equations, 7 figures, 1 algorithm.

Figures (7)

  • Figure 1: Representative example showcasing how the proposed formulation improves our ability to model agents' behavior in noncooperative interactions. Here, two simulated delivery robots are crossing an intersection, and by correctly inferring their degree of foresightedness (i.e., the $\gamma$ parameter in the inset), our method recovers their trajectories more accurately than a baseline approach.
  • Figure 2: Intersection from the InD dataset, overlaid with trajectories for all four agents generated in a receding horizon fashion using our method and the baseline. Note that we reduced the opacity of the complete foresighted rollout after $t = 15$ for visual clarity.
  • Figure 3: Snapshots from receding horizon inference and planning with the Waymax simulator, for a game involving the blue, green, yellow, and red agents. The ego robot (blue) employs our method, while the other cars are controlled by the Waymax simulator.
  • Figure 4: Heatmap of $\mathcal{P}$ from \ref{prb:InverseProblemStatement} for the crosswalk experiment for the fully observable (left) and partially observable (right) cases. Both players' $\gamma$ are varied while holding $\theta$ constant at the ground truth values, and the resulting cost is scaled to lie in $[0, 1]$.
  • Figure 5: \ref{alg:1} reliably reduces predicted trajectory error in the crosswalk experiment, for partial and full state observations. Solid/dotted lines denote means and the opaque band indicates standard deviation.
  • ...and 2 more figures