GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS

Saman Kazemkhani, Aarav Pandya, Daphne Cornelisse, Brennan Shacklett, Eugene Vinitsky

TL;DR

GPUDrive tackles the bottleneck of learning-based multi-agent planning by delivering a GPU-accelerated, data-driven driving simulator capable of over a million steps per second. By leveraging the Madrona engine, BVH-based collision culling, and polyline decimation, it scales to hundreds of worlds with hundreds of agents while providing LiDAR and human-view sensor modalities and Python access for RL/IL workflows. The authors demonstrate training RL agents on the Waymo Open Motion Dataset, achieving high goal-reaching rates in minutes and scaling to thousands of scenarios in hours, with substantial end-to-end speedups over CPU-based or less scalable simulators. They also open-source the simulator, baselines, and training loops to foster reproducibility and broader research in data-driven, multi-agent autonomous driving. This work enables rapid experimentation and evaluation of autonomous driving planners under diverse, complex multi-agent interactions.

Abstract

Multi-agent learning algorithms have been successful at generating superhuman planning in various games but have had limited impact on the design of deployed multi-agent planners. A key bottleneck in applying these techniques to multi-agent planning is that they require billions of steps of experience. To enable the study of multi-agent planning at scale, we present GPUDrive. GPUDrive is a GPU-accelerated, multi-agent simulator built on top of the Madrona Game Engine capable of generating over a million simulation steps per second. Observation, reward, and dynamics functions are written directly in C++, allowing users to define complex, heterogeneous agent behaviors that are lowered to high-performance CUDA. Despite these low-level optimizations, GPUDrive is fully accessible through Python, offering a seamless and efficient workflow for multi-agent, closed-loop simulation. Using GPUDrive, we train reinforcement learning agents on the Waymo Open Motion Dataset, achieving efficient goal-reaching in minutes and scaling to thousands of scenarios in hours. We open-source the code and pre-trained agents at https://github.com/Emerge-Lab/gpudrive.

Paper Structure (32 sections, 4 equations, 8 figures, 2 tables)

Figures (8)

  • Figure 1: Extremely fast multi-agent simulation with GPUDrive. Top: Bird's-eye view of Waymo Open Motion Dataset scenarios in GPUDrive, with boxes marking controlled agents and circles denoting their goals. Bottom: Corresponding agent views, centered on one agent. Observations can be easily configured based on the user's objectives. Here, agents are provided with a scene view through a relative coordinate frame. Shown are nearby road points within a configurable radius (set to 50 meters) and the relative positions of other agents in the scene.
  • Figure 2: Example scenarios from the Waymo Open Motion Dataset rendered in GPUDrive. The blue boxes and circles indicate agents and their respective destinations.
  • Figure 3: Peak goodput of GPUDrive on consumer-grade and datacenter-class GPUs, compared to the original CPU-based and GPU-based implementations, using the radial filter observation. Left: The total number of agent steps per second (ASPS), where the agent count is the number of objects for which our system computes observations at each time step. To ensure a fair comparison, we align the conditions with those used in gulino2024waymax, where all cars, bicyclists, and pedestrians are considered valid experience-generating agents. Center: The distribution of controllable agents across 512 scenarios in the Waymo Open Motion Dataset ($\mu \approx 10.8$, red line; $\sigma \approx 9.3$). These numbers are obtained using the "nontrivial" initialization mode in GPUDrive, which initializes only the agents that are more than 2 meters away from their final position. Right: The total number of controllable agent steps per second (CASPS) as we increase the number of worlds (parallelism).
  • Figure 4: ASPS and CASPS comparison between the Radial Filter and LiDAR observation types. The radial filter is slower than LiDAR because it performs a linear scan of nearby objects. In contrast, the LiDAR observation type is GPU-accelerated and delivers significantly higher throughput. As the plots show, depending on the scene selection, the LiDAR observation can achieve a speedup of approximately 3x over the radial filter.
  • Figure 5: From hours to seconds. Left: Training performance (goal-reaching rate) as a function of the global step of the controlled agents (CASPS). Center: Training performance as a function of wall-clock time in seconds. Right: Training performance as a function of wall-clock time where the x-axis is on a log scale. Runs are averaged across three seeds, replicating environmental and experimental conditions as closely as possible. See Appendix \ref{sec:training_details} for the hyperparameters and training details. The green dotted line marks optimal performance (all agents reach their goal without collisions).
  • ...and 3 more figures
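
The captions for Figures 1, 3, and 4 describe two simple geometric rules: a radial filter that keeps road points within a configurable radius (50 meters in Figure 1) and expresses them relative to the observing agent, and a "nontrivial" initialization that keeps only agents starting more than 2 meters from their final position. The sketch below illustrates both rules in plain Python. It is not GPUDrive's implementation or API: the function names `radial_filter` and `is_nontrivial`, their parameters, and the choice of a 2D rotation by the agent's heading for the relative frame are all assumptions made for illustration.

```python
import math


def radial_filter(ego_xy, ego_yaw, points_xy, radius=50.0):
    """Keep points within `radius` meters of the ego agent, in the ego frame.

    Hypothetical helper: a linear scan over all candidate points, as the
    Figure 4 caption describes for the radial filter observation.
    """
    cos_y, sin_y = math.cos(ego_yaw), math.sin(ego_yaw)
    visible = []
    for px, py in points_xy:
        dx, dy = px - ego_xy[0], py - ego_xy[1]
        if math.hypot(dx, dy) <= radius:
            # Rotate the world-frame offset by -ego_yaw so that +x points
            # along the ego heading (an assumed convention).
            visible.append((cos_y * dx + sin_y * dy, -sin_y * dx + cos_y * dy))
    return visible


def is_nontrivial(start_xy, final_xy, min_distance=2.0):
    """True if the agent starts more than `min_distance` meters from its final position."""
    return math.dist(start_xy, final_xy) > min_distance


if __name__ == "__main__":
    # Ego at (10, 0) heading along +y; one nearby point and one far point.
    road_points = [(10.0, 5.0), (100.0, 0.0)]
    print(radial_filter((10.0, 0.0), math.pi / 2, road_points))
    print(is_nontrivial((0.0, 0.0), (3.0, 0.0)))
```

Because the scan visits every candidate point for every agent, its cost grows linearly with the number of points in the scene, which is consistent with the caption's explanation of why the radial filter trails the LiDAR observation in throughput.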