HAD-Gen: Human-like and Diverse Driving Behavior Modeling for Controllable Scenario Generation
Cheng Wang, Lingxin Kong, Massimiliano Tamborski, Stefano V. Albrecht
TL;DR
HAD-Gen addresses the challenge of generating realistic, diverse, and controllable driving scenarios for autonomous vehicle (AV) testing. It clusters naturalistic driving data into risk-based driving styles, learns a reward function for each cluster with maximum entropy inverse reinforcement learning (MaxEnt IRL), and trains policies through offline reinforcement learning (RL) followed by multi-agent reinforcement learning (MARL) with centralized training and decentralized execution (CTDE). The approach yields diverse, human-like behaviors and generalizes well, outperforming baselines in goal-reaching while maintaining safety in unseen scenarios. Key findings are that self-replay MARL outperforms log-replay and offline-only methods, and that per-style reward functions help capture different driving preferences. The framework enables more realistic AV safety validation and provides a path toward explainable, scenario-driven testing in real-world traffic environments.
Abstract
Simulation-based testing has emerged as an essential tool for verifying and validating autonomous vehicles (AVs). However, existing approaches, such as deterministic and imitation learning-based driver models, struggle to capture the variability of human-like driving behavior. To address this, we propose HAD-Gen, a general framework for realistic traffic scenario generation that simulates diverse human-like driving behaviors. The framework first clusters vehicle trajectory data into different driving styles according to safety features. It then applies maximum entropy inverse reinforcement learning to each cluster to learn the reward function corresponding to that driving style. Using these reward functions, the method combines offline reinforcement learning pre-training with multi-agent reinforcement learning to obtain general and robust driving policies. Multi-perspective simulation results show that the proposed scenario generation framework can simulate diverse, human-like driving behaviors with strong generalization capability. The framework achieves a 90.96% goal-reaching rate, an off-road rate of 2.08%, and a collision rate of 6.91% in the generalization test, outperforming prior approaches by over 20% in goal-reaching performance. The source code is released at https://github.com/RoboSafe-Lab/Sim4AD.
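
To make the per-style reward learning step concrete, the following is a minimal sketch of maximum entropy IRL with a linear reward, fitted separately for each driving-style cluster. It is an illustration under stated assumptions, not the released implementation: the reward is assumed to be a weighted sum of trajectory features, each demonstration is assumed to come with a small finite set of candidate trajectories, and the feature layout (normalized speed, normalized headway), the style names, and all function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def maxent_irl(demos, lr=0.1, steps=200):
    """Fit linear reward weights theta by gradient ascent on the MaxEnt likelihood.

    demos: list of (expert_features, candidate_features) pairs, where
    candidate_features is a list of feature vectors for alternative
    trajectories (assumed here to include the expert one).
    """
    dim = len(demos[0][0])
    theta = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for expert_f, candidates in demos:
            # Under MaxEnt, P(trajectory) is proportional to exp(theta . features).
            probs = softmax([dot(theta, f) for f in candidates])
            for k in range(dim):
                expected = sum(p * f[k] for p, f in zip(probs, candidates))
                # Gradient: expert features minus expected features.
                grad[k] += expert_f[k] - expected
        theta = [t + lr * g / len(demos) for t, g in zip(theta, grad)]
    return theta


def learn_style_rewards(demos_by_style, **kwargs):
    """Run MaxEnt IRL independently on each driving-style cluster."""
    return {style: maxent_irl(d, **kwargs) for style, d in demos_by_style.items()}


# Toy data (hypothetical): features are [normalized speed, normalized headway].
candidates = [[1.0, 0.0], [0.0, 1.0]]
demos_by_style = {
    "aggressive": [([1.0, 0.0], candidates)],  # expert prefers speed
    "cautious": [([0.0, 1.0], candidates)],    # expert prefers headway
}
rewards = learn_style_rewards(demos_by_style)
print(rewards)
```

On this toy data the two clusters end up with reward weights of opposite sign on each feature, which is the property that lets one reward function per style encode different driving preferences. In the framework described above, rewards of this kind would then drive the offline RL pre-training and MARL stages.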
