Planning with Adaptive World Models for Autonomous Driving

Arun Balajee Vasudevan, Neehar Peri, Jeff Schneider, Deva Ramanan

TL;DR

City-level driving styles vary significantly, indicating the need for adaptive planning. The paper introduces BehaviorNet, a GCNN that predicts IDM control parameters for surrounding agents, and AdaptiveDriver, an MPC-based planner that unrolls behavior-conditioned world models to guide planning. Key contributions include showing city-specific driving patterns, learning behavior priors that align with real logs, and achieving state-of-the-art results on the nuPlan reactive benchmark with cross-city generalization, all while maintaining practical latency. This work demonstrates a practical path to cross-city robustness by combining learning-based priors with rule-based planning in real-world driving data.

Abstract

Motion planning is crucial for safe navigation in complex urban environments. Historically, motion planners (MPs) have been evaluated with procedurally-generated simulators like CARLA. However, such synthetic benchmarks do not capture real-world multi-agent interactions. nuPlan, a recently released MP benchmark, addresses this limitation by augmenting real-world driving logs with closed-loop simulation logic, effectively turning the fixed dataset into a reactive simulator. We analyze the characteristics of nuPlan's recorded logs and find that each city has its own unique driving behaviors, suggesting that robust planners must adapt to different environments. We learn to model such unique behaviors with BehaviorNet, a graph convolutional neural network (GCNN) that predicts reactive agent behaviors using features derived from recently-observed agent histories; intuitively, some aggressive agents may tailgate lead vehicles, while others may not. To model such phenomena, BehaviorNet predicts the parameters of an agent's motion controller rather than directly predicting its spacetime trajectory (as most forecasters do). Finally, we present AdaptiveDriver, a model-predictive control (MPC) based planner that unrolls different world models conditioned on BehaviorNet's predictions. Our extensive experiments demonstrate that AdaptiveDriver achieves state-of-the-art results on the nuPlan closed-loop planning benchmark, improving over prior work by 2% on Test-14 Hard R-CLS, and generalizes even when evaluated on never-before-seen cities.
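To make "predicting the parameters of a motion controller" concrete, here is a minimal sketch of the Intelligent Driver Model (IDM), which the TL;DR names as the controller BehaviorNet parameterizes. The formula is the standard IDM car-following law; the class name, field names, and all numeric values are illustrative assumptions, not the paper's learned parameters or the nuPlan implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class IDMParams:
    """Hypothetical container for IDM controls; all defaults are illustrative."""
    target_speed: float = 15.0  # desired free-road speed v0 (m/s)
    min_gap: float = 2.0        # standstill gap s0 to the lead vehicle (m)
    headway: float = 1.5        # desired time headway T (s)
    max_accel: float = 1.5      # maximum acceleration a (m/s^2)
    max_decel: float = 3.0      # comfortable deceleration b (m/s^2)


def idm_acceleration(p, speed, gap, closing_speed):
    """Standard IDM acceleration for a follower `gap` meters behind its lead.

    closing_speed is follower speed minus lead speed (positive when approaching).
    """
    interaction = speed * p.headway + speed * closing_speed / (
        2.0 * math.sqrt(p.max_accel * p.max_decel)
    )
    # Clamping the dynamic term at zero is a common variant, assumed here.
    desired_gap = p.min_gap + max(0.0, interaction)
    free_road = (speed / p.target_speed) ** 4  # acceleration exponent delta = 4
    return p.max_accel * (1.0 - free_road - (desired_gap / gap) ** 2)


# Two made-up parameter sets standing in for different predicted behaviors.
aggressive = IDMParams(min_gap=1.0, headway=0.8)
passive = IDMParams(min_gap=4.0, headway=2.0)

a_aggr = idm_acceleration(aggressive, speed=10.0, gap=15.0, closing_speed=0.0)
a_pass = idm_acceleration(passive, speed=10.0, gap=15.0, closing_speed=0.0)
```

At the same speed and gap, the tailgating parameter set still accelerates while the cautious one brakes, which is the kind of behavioral difference a handful of controller parameters can encode without predicting a full trajectory.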

Paper Structure (6 sections, 1 equation, 6 figures, 6 tables)

Figures (6)

  • Figure 1: Drivers Behave Differently in Different Cities. Using the nuPlan benchmark, we compare the distance between the ego-vehicle and lead agent (min-gap) in Pittsburgh (PIT), Boston (BOS), Las Vegas (LAS) and Singapore (SIN). Interestingly, we find that the min-gap distribution between cities differs dramatically. For example, Boston drivers are more aggressive (e.g. drive with lower average min-gap) than Pittsburgh drivers. This suggests that (a) robust planners must adapt to diverse driving conditions and (b) model-predictive control (MPC) planners may benefit from adaptive world models that capture such behaviors.
  • Figure 2: Visualizing Adaptive World Models. Each column visualizes the same initial traffic scenario unrolled with different world models (trained on agents from different cities). We visualize the ego-vehicle in yellow and other agents in gray. Blue lines represent the lane-graph. On the left, the BOS world model (top) produces agents that tend to tailgate more than the PIT world model, consistent with the min-gap statistics across those two datasets in Figure 1. In the center, agents in the LAS world model have a higher max acceleration compared to those of the default IDM world model. On the right, agents in the SIN world model have a higher max acceleration and min-gap compared to the PIT world model. We demonstrate that such adaptive world models can be used to significantly improve the accuracy of model-predictive control (MPC) planners. To do so, we train BehaviorNet, a network that predicts parameters of an adaptive world model based on the recent observed history of agents in the scene.
  • Figure 3: BehaviorNet Architecture. BehaviorNet uses past trajectories and map context to predict future agent behaviors parameterized as IDM controls. Following LaneGCN (Liang et al., 2020), we use a graph convolutional network (GCN) to extract map features from the lane graph. Next, we extract agent features from past trajectories. We then use LaneGCN's Agent-Map Feature Fusion to model interactions between agents and the map. Lastly, we pass these agent-map features through an MLP to predict IDM controls.
  • Figure 4: Visualizing Clusters of Behaviors. In (a), we optimize scenario-specific world models using Equation 1 and visualize per-log IDM parameters using a t-SNE plot, coloring cities differently. For comparison, we also plot per-city-optimized IDM parameters (from the table of per-city IDM parameters) as large colored dots. In (b), we cluster per-scenario IDM parameters using $K$-means and visualize different clusters. Each cluster represents a unique emergent driving behavior. In (c), we compare the distribution of min-gap between the "aggressive driver" cluster (left) and "passive driver" cluster (right). We note that the "aggressive driver" cluster has a lower average min-gap than the "passive driver" cluster, validating that our $K$-means clusters encode unique city-agnostic driving behaviors.
  • Figure 5: Overview of AdaptiveDriver's Architecture. AdaptiveDriver's architecture extends PDM-C by replacing "world-on-rails" rollouts with adaptive reactive world models using BehaviorNet. We predict future agent behaviors parameterized as IDM controls using scene context including the ego-vehicle's history, past agent trajectories, and surrounding lane graph. Similar to PDM-C, our planner identifies the nearest center-line to the goal using graph-based search. We generate many trajectory proposals and score each according to BehaviorNet's reactive world model. The proposal with the highest score is selected and executed.
  • ...and 1 more figure
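The propose, unroll, score, and select loop described in the Figure 5 caption can be sketched as a toy one-dimensional example. Everything here is an assumption made for illustration: the ego-vehicle drives ahead of a single reactive follower governed by IDM, proposals are constant ego accelerations, vehicle lengths are ignored, and the `score` function (progress minus a penalty for small gaps) is a made-up stand-in for the planner's actual scoring. The parameter values are likewise invented, not BehaviorNet outputs.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class IDMParams:
    """Hypothetical IDM parameter set for one reactive agent."""
    target_speed: float  # m/s
    min_gap: float       # m
    headway: float       # s
    max_accel: float     # m/s^2
    max_decel: float     # m/s^2


def idm_accel(p, speed, gap, closing_speed):
    """Standard IDM acceleration; the gap is floored to avoid division by zero."""
    dynamic = speed * p.headway + speed * closing_speed / (
        2.0 * math.sqrt(p.max_accel * p.max_decel)
    )
    desired = p.min_gap + max(0.0, dynamic)
    return p.max_accel * (
        1.0 - (speed / p.target_speed) ** 4 - (desired / max(gap, 0.1)) ** 2
    )


def rollout(ego_accel, follower, horizon_s=8.0, dt=0.1):
    """Unroll one ego proposal against one reactive follower.

    Returns (ego progress in meters, smallest gap seen during the rollout).
    """
    ego_pos, ego_speed = 20.0, 10.0
    fol_pos, fol_speed = 0.0, 10.0
    smallest_gap = ego_pos - fol_pos
    for _ in range(int(round(horizon_s / dt))):
        gap = ego_pos - fol_pos
        fol_accel = idm_accel(follower, fol_speed, gap, fol_speed - ego_speed)
        ego_speed = max(0.0, ego_speed + ego_accel * dt)
        fol_speed = max(0.0, fol_speed + fol_accel * dt)
        ego_pos += ego_speed * dt
        fol_pos += fol_speed * dt
        smallest_gap = min(smallest_gap, ego_pos - fol_pos)
    return ego_pos - 20.0, smallest_gap


def score(progress, smallest_gap, safety_gap=5.0):
    """Made-up score: reward progress, heavily penalize unsafe gaps."""
    penalty = 1000.0 if smallest_gap < safety_gap else 0.0
    return progress - penalty


def select_proposal(proposals, follower):
    """Return the ego acceleration whose rollout scores highest."""
    return max(proposals, key=lambda a: score(*rollout(a, follower)))


# Two invented world models standing in for different predicted behaviors.
aggressive = IDMParams(15.0, 1.0, 0.8, 1.5, 3.0)
passive = IDMParams(15.0, 4.0, 2.0, 1.5, 3.0)
proposals = [-2.0, 0.0, 1.0]  # constant ego accelerations (m/s^2)

best_vs_passive = select_proposal(proposals, passive)
best_vs_aggressive = select_proposal(proposals, aggressive)
```

The same ego proposal yields different rollouts under the two world models (the aggressive follower closes the gap while the passive one falls back), so the score of each proposal depends on which behavior is predicted for the surrounding agents.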