TorchDriveEnv: A Reinforcement Learning Benchmark for Autonomous Driving with Reactive, Realistic, and Diverse Non-Playable Characters
Jonathan Wilder Lavington, Ke Zhang, Vasileios Lioutas, Matthew Niedoba, Yunpeng Liu, Dylan Green, Saeid Naderiparizi, Xiaoxuan Liang, Setareh Dabiri, Adam Ścibior, Berend Zwartsenberg, Frank Wood
TL;DR
The work addresses the need for realistic, efficient, and adaptable simulators to train autonomous driving controllers under varied NPC behaviors. It introduces TorchDriveSim, a differentiable 2D driving simulator, and TorchDriveEnv, a Gym-compatible RL benchmark that integrates data-driven, reactive NPCs via an external API, with CARLA-based maps and train/validation splits. Evaluations of common RL baselines (SAC, PPO, A2C, TD3) reveal that multi-agent training is more challenging yet yields better generalization, while even strong policies incur infractions, underscoring the need for objective-aligned optimization. Overall, TorchDriveSim and TorchDriveEnv provide a practical, extensible framework for robust AV controller development and pave the way for more realistic NPC behavior modeling and differentiable dynamics.
Abstract
The training, testing, and deployment of autonomous vehicles require realistic and efficient simulators. Moreover, because the problems posed by different autonomous systems vary widely, these simulators need to be easy to use and easy to modify. To address these problems, we introduce TorchDriveSim and its benchmark extension, TorchDriveEnv. TorchDriveEnv is a lightweight reinforcement learning benchmark programmed entirely in Python, which can be modified to test a number of different factors in learned vehicle behavior, including the effect of varying kinematic models, agent types, and traffic control patterns. Most importantly, unlike many replay-based simulation approaches, TorchDriveEnv is fully integrated with a state-of-the-art behavioral simulation API. This allows users to train and evaluate driving models alongside data-driven Non-Playable Characters (NPCs) whose initializations and driving behavior are reactive, realistic, and diverse. We illustrate the efficiency and simplicity of TorchDriveEnv by evaluating common reinforcement learning baselines in both training and validation environments. Our experiments show that TorchDriveEnv is easy to use, but difficult to solve.
