Driving is a Game: Combining Planning and Prediction with Bayesian Iterative Best Response

Aron Distelzweig, Yiwei Wang, Faris Janjoš, Marcel Hallgarten, Mihai Dobre, Alexander Langmann, Joschka Boedecker, Johannes Betz

TL;DR

BIBeR introduces a principled framework that unifies state-of-the-art motion prediction with game-theoretic planning through Bayesian Iterative Best Response. By re-weighting a diverse set of ego and surrounding trajectories within an IBR loop and modulating updates with a Bayesian confidence score, BIBeR achieves interaction-aware planning that can both react to and influence other drivers. The approach integrates marginals from modern predictors (LAformer) with a sampling-based planner (SPDM) and demonstrates strong gains on interactive benchmarks like interPlan lane-change, as well as robust performance on standard nuPlan tasks. Empirical results show BIBeR and its CV-filtered variant outperform baselines in highly interactive scenarios, with notable improvements in safety-related metrics and planning efficiency, while ablations highlight the importance of update order and the value of early iterations. The work also provides a thorough analysis of predictor choices (marginal vs joint) and runtime implications, arguing for a flexible, modular, and principled interaction-aware planning paradigm with practical potential for real-time deployment.

Abstract

Autonomous driving planning systems perform nearly perfectly in routine scenarios using lightweight, rule-based methods but still struggle in dense urban traffic, where lane changes and merges require anticipating and influencing other agents. Modern motion predictors offer highly accurate forecasts, yet their integration into planning is mostly rudimentary: predictions serve only to discard unsafe plans. Similarly, end-to-end models offer a one-way integration that avoids the challenges of jointly modeling prediction and planning under uncertainty. In contrast, game-theoretic formulations offer a principled alternative but have seen limited adoption in autonomous driving. We present Bayesian Iterative Best Response (BIBeR), a framework that unifies motion prediction and game-theoretic planning into a single interaction-aware process. BIBeR is the first to integrate a state-of-the-art predictor into an Iterative Best Response (IBR) loop, repeatedly refining the strategies of the ego vehicle and surrounding agents. This repeated best-response process approximates a Nash equilibrium, enabling bidirectional adaptation where the ego both reacts to and shapes the behavior of others. In addition, our proposed Bayesian confidence estimation quantifies prediction reliability and modulates update strength, making updates more conservative under low confidence and more decisive under high confidence. BIBeR is compatible with modern predictors and planners, combining the transparency of structured planning with the flexibility of learned models. Experiments show that BIBeR achieves an 11% improvement over state-of-the-art planners on highly interactive interPlan lane-change scenarios, while also outperforming existing approaches on standard nuPlan benchmarks.
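To make the re-weighting idea concrete, the following is a minimal sketch of an IBR loop over fixed candidate sets. It is not the authors' implementation: the function names (`pairwise_cost`, `ibr_reweight`), the distance-based interaction cost, the softmax "soft" best response, and the linear blending by a per-agent confidence in [0, 1] are all assumptions chosen for illustration.

```python
import numpy as np

def pairwise_cost(traj_a, traj_b, safe_dist=2.0):
    """Hypothetical interaction cost: penalize time steps where two
    trajectories of shape (T, 2) come closer than safe_dist."""
    dist = np.linalg.norm(traj_a - traj_b, axis=-1)
    return float(np.sum(np.maximum(0.0, safe_dist - dist)))

def ibr_reweight(candidates, base_cost, weights, confidence, iters=10, temp=1.0):
    """Re-weight each agent's fixed candidate set in turn.

    candidates: list over agents of arrays with shape (K_i, T, 2)
    base_cost:  list of arrays (K_i,), agent-specific cost per candidate
    weights:    list of arrays (K_i,), initial probabilities per candidate
    confidence: list of floats in [0, 1] scaling the update strength (assumed form)
    """
    n = len(candidates)
    weights = [np.asarray(w, dtype=float).copy() for w in weights]
    # Interaction costs depend only on the fixed candidates, so compute them once.
    pair = {
        (i, j): np.array([[pairwise_cost(a, b) for b in candidates[j]]
                          for a in candidates[i]])
        for i in range(n) for j in range(n) if i != j
    }
    for _ in range(iters):
        for i in range(n):  # agents are updated sequentially, in index order
            cost = np.asarray(base_cost[i], dtype=float).copy()
            for j in range(n):
                if j != i:
                    # Expected interaction cost under agent j's current weights.
                    cost += pair[(i, j)] @ weights[j]
            # Soft best response: lower cost gets higher probability.
            logits = -cost / temp
            response = np.exp(logits - logits.max())
            response /= response.sum()
            # Low confidence keeps the old weights, high confidence adopts the response.
            weights[i] = (1.0 - confidence[i]) * weights[i] + confidence[i] * response
    return weights

# Toy scene: agent 0 (ego) can follow agent 1's path or use a lane 10 m away.
T = 5
straight = np.stack([np.arange(T, dtype=float), np.zeros(T)], axis=-1)
far_lane = straight + np.array([0.0, 10.0])
candidates = [np.stack([straight, far_lane]), straight[None]]
base_cost = [np.zeros(2), np.zeros(1)]
init = [np.array([0.5, 0.5]), np.array([1.0])]
final = ibr_reweight(candidates, base_cost, init, confidence=[1.0, 1.0], iters=3)
```

In this toy scene the ego's weight shifts toward the candidate that does not overlap with the other agent's path. Setting a confidence of zero leaves an agent's weights at their initial values, which is one simple way to realize the "more conservative under low confidence" behavior described above.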

Paper Structure

This paper contains 39 sections, 14 equations, 9 figures, 12 tables, 1 algorithm.

Figures (9)

  • Figure 1: Overview of the proposed Bayesian Iterative Best Response (BIBeR) framework. (i) The framework first generates a set of candidate trajectories for the ego vehicle, where the proposals are uniformly distributed. (ii) A prediction model forecasts the independent future motions of surrounding agents. (iii) Bayesian confidence estimation balances assertiveness and conservatism in BIBeR: for each non-ego agent, a confidence term measures how well the updated trajectory matches observed motion versus the original prediction, directly guiding the weight update. (iv) The Iterative Best Response (IBR) procedure refines these predictions: each agent, including the ego, iteratively re-weights its fixed set of candidate trajectories to optimally respond to the others, thereby approximating a game-theoretic equilibrium.
  • Figure 2: Prediction error at a $0.1$s horizon for LAformer and constant-velocity under reactive (R) and SMART-reactive (SR) settings. Left: Errors on the Test14-hard benchmark. Right: Errors on the interPlanLC benchmark.
  • Figure 3: Comparison of nuPlan subscores with confidence estimation enabled (y-axis) versus disabled (x-axis). Left: evaluation on the interPlanLC SMART-reactive (SR) benchmark. Right: evaluation on the Test14-random SMART-reactive (SR) benchmark. The evaluation uses BIBeR with LAformer as the prediction model. Deviations from the diagonal indicate changes in performance. Other evaluation metrics are omitted, as they show no change in performance and are not relevant to this analysis.
  • Figure 4: Top: highest-scored trajectory per agent from initial distribution before IBR. Bottom: highest-scored trajectory per agent from updated distributions after $k=10$ iterations of IBR. Several scenarios require a lane change to reach the goal or to maintain progress. BIBeR consistently identifies feasible gaps to merge into. For instance, in Scenario 1, the initial trajectory distributions do not permit a safe merge into the adjacent lane. After BIBeR updates the distributions, the ego vehicle infers that the following agent will brake in response to its lane change, enabling a safe maneuver. A similar pattern appears in Scenario 2, where BIBeR anticipates a slight deceleration from the following agent, allowing the ego vehicle to execute a lane change and to accelerate more aggressively compared to the initial trajectories. In Scenario 3, the model handles a particularly challenging lane change: BIBeR accurately predicts the follower’s yielding behavior, making the merge possible, something that a standard prediction-then-planning pipeline would fail to achieve. Comparable behavior can be observed in Scenario 4 and Scenario 5, where BIBeR refines the interaction between agents to facilitate safer and more coordinated maneuvers. BIBeR did not cause a collision in any of these scenarios. Ego agent and trajectory are shown in red, while surrounding agents and corresponding trajectories are shown in blue.
  • Figure 5: Runtime in seconds for the Iterative Best Response procedure across different numbers of iterations. Proposal generation and prediction are excluded from the measurement.
  • ...and 4 more figures
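
The Figure 1 caption describes a confidence term that compares how well the updated trajectory and the original prediction each explain an agent's observed motion. One way to read this is as a two-hypothesis Bayes update. The sketch below takes that reading and should not be mistaken for the paper's formula: the function name `bayesian_confidence`, the isotropic Gaussian likelihood, and the `sigma` and `prior` parameters are all assumptions.

```python
import numpy as np

def bayesian_confidence(observed, pred_updated, pred_original, sigma=0.5, prior=0.5):
    """Posterior probability that the updated prediction, rather than the
    original one, explains the observed position (assumed Gaussian noise)."""
    observed = np.asarray(observed, dtype=float)

    def log_likelihood(pred):
        # Log of an isotropic Gaussian density, up to a constant shared by both hypotheses.
        err = observed - np.asarray(pred, dtype=float)
        return -0.5 * np.sum(err ** 2) / sigma ** 2

    # Posterior log-odds of "updated" versus "original".
    log_odds = (log_likelihood(pred_updated) + np.log(prior)) \
             - (log_likelihood(pred_original) + np.log(1.0 - prior))
    log_odds = np.clip(log_odds, -50.0, 50.0)  # avoid overflow in exp
    return float(1.0 / (1.0 + np.exp(-log_odds)))

# The observation coincides with the updated prediction and lies 1 m from the original.
conf = bayesian_confidence(observed=[1.0, 0.0],
                           pred_updated=[1.0, 0.0],
                           pred_original=[0.0, 0.0])
```

When both predictions explain the observation equally well, this sketch returns the prior. The value moves toward 1 as the updated trajectory fits the observed motion better, and could then be used to scale the strength of a weight update.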