HAD-Gen: Human-like and Diverse Driving Behavior Modeling for Controllable Scenario Generation

Cheng Wang, Lingxin Kong, Massimiliano Tamborski, Stefano V. Albrecht

TL;DR

HAD-Gen addresses the challenge of generating realistic, diverse, and controllable driving scenarios for autonomous vehicle (AV) testing. It clusters naturalistic driving data into risk-based styles, learns a reward function per cluster with maximum entropy inverse reinforcement learning (MaxEnt IRL), and trains policies through offline reinforcement learning (RL) followed by multi-agent RL (MARL) with centralized training and decentralized execution (CTDE). The approach yields diverse human-like behaviors and strong generalization, outperforming baselines in goal-reaching while maintaining safety across unseen scenarios. Key findings include the superiority of self-replay MARL over log-replay and offline-only methods, and the benefit of per-style reward functions for capturing different driving preferences. The framework enables more realistic AV safety validation and provides a path toward explainable, scenario-driven testing in real-world traffic environments.

Abstract

Simulation-based testing has emerged as an essential tool for verifying and validating autonomous vehicles (AVs). However, contemporary methodologies, such as deterministic and imitation learning-based driver models, struggle to capture the variability of human-like driving behavior. Given these challenges, we propose HAD-Gen, a general framework for realistic traffic scenario generation that simulates diverse human-like driving behaviors. The framework first clusters the vehicle trajectory data into different driving styles according to safety features. It then employs maximum entropy inverse reinforcement learning on each of the clusters to learn the reward function corresponding to each driving style. Using these reward functions, the method integrates offline reinforcement learning pre-training and multi-agent reinforcement learning algorithms to obtain general and robust driving policies. Multi-perspective simulation results show that our proposed scenario generation framework can simulate diverse, human-like driving behaviors with strong generalization capability. The proposed framework achieves a 90.96% goal-reaching rate, an off-road rate of 2.08%, and a collision rate of 6.91% in the generalization test, outperforming prior approaches by over 20% in goal-reaching performance. The source code is released at https://github.com/RoboSafe-Lab/Sim4AD.
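The per-cluster reward learning step described in the abstract can be illustrated with a minimal sketch of maximum entropy IRL. The sketch assumes a reward that is linear in trajectory features and approximates the partition function with a finite set of sampled candidate trajectories per demonstration. The function names, feature layout, and hyperparameters are illustrative assumptions and are not taken from the paper or the Sim4AD repository.

```python
import numpy as np


def maxent_irl(expert_feats, candidate_feats, lr=0.1, steps=200, l2=0.01):
    """Fit linear reward weights theta by MaxEnt IRL gradient ascent.

    expert_feats:    (N, D) features of the N demonstrated trajectories.
    candidate_feats: (N, K, D) features of K sampled candidate trajectories
                     per demonstration (assumed to include the demonstration).
    All hyperparameters are illustrative assumptions.
    """
    _, _, d = candidate_feats.shape
    theta = np.zeros(d)
    for _ in range(steps):
        rewards = candidate_feats @ theta                  # (N, K)
        rewards -= rewards.max(axis=1, keepdims=True)      # numerical stability
        probs = np.exp(rewards)
        probs /= probs.sum(axis=1, keepdims=True)          # softmax over candidates
        # Expected features under the current MaxEnt trajectory distribution.
        expected = np.einsum("nk,nkd->nd", probs, candidate_feats)
        # Log-likelihood gradient: demonstrated minus expected features,
        # with an L2 penalty on the weights.
        grad = (expert_feats - expected).mean(axis=0) - l2 * theta
        theta += lr * grad
    return theta


def demo():
    """Synthetic 'cautious' cluster: drivers pick the lowest value of feature 0."""
    rng = np.random.default_rng(0)
    candidates = rng.normal(size=(50, 8, 2))
    chosen = candidates[:, :, 0].argmin(axis=1)
    expert = candidates[np.arange(50), chosen]
    return maxent_irl(expert, candidates)


if __name__ == "__main__":
    print(demo())
```

In this toy setup the learned weight on feature 0 comes out negative, reflecting that the synthetic drivers avoid it. Running the same routine separately on each driving-style cluster would give one weight vector per style, which is the idea behind per-style reward functions.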

Paper Structure

This paper contains 21 sections, 26 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: The overall workflow of our proposed method. The driving behavior in a dataset is clustered into distinct driving styles. Each sub-dataset is then used to reconstruct a reward function via IRL. The reconstructed rewards are the basis for offline RL and MARL, which generate a driving policy for each cluster. The resulting driving policies are deployed in simulation according to a policy selection strategy.
  • Figure 2: Detailed overview of the proposed HAD-Gen methodology, including driving style recognition, MaxEnt IRL and policy training using MARL.
  • Figure 3: We use inverse time-to-collision (iTTC) and time headway (THW) to determine the risk level of a driving behavior. The thresholds follow those suggested in xue2019rapid.
  • Figure 4: Distribution of key metrics demonstrating the diversity of generated scenarios for different driving policies.
  • Figure 5: The driving policies corresponding to different driving styles show different driving behaviors in the same scenario.
  • ...and 1 more figure
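The risk features named in the Figure 3 caption can be sketched for a simple car-following pair as follows. The definitions used here are the common ones (iTTC as closing speed divided by gap, THW as gap divided by follower speed). The threshold values and the rule for combining the two features are placeholders chosen for illustration only; the actual thresholds come from xue2019rapid and are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class CarFollowingState:
    gap_m: float            # bumper-to-bumper distance to the lead vehicle
    ego_speed_mps: float    # follower speed
    lead_speed_mps: float   # lead vehicle speed


def inverse_ttc(s: CarFollowingState) -> float:
    """Inverse time-to-collision in 1/s; 0 when the follower is not closing in."""
    if s.gap_m <= 0:
        raise ValueError("gap must be positive")
    closing_speed = s.ego_speed_mps - s.lead_speed_mps
    return max(closing_speed, 0.0) / s.gap_m


def time_headway(s: CarFollowingState) -> float:
    """Time headway in s; infinite for a stationary follower."""
    if s.gap_m <= 0:
        raise ValueError("gap must be positive")
    if s.ego_speed_mps <= 0:
        return float("inf")
    return s.gap_m / s.ego_speed_mps


# Placeholder thresholds, NOT the values used in the paper.
ITTC_HIGH = 0.5   # 1/s
THW_LOW = 1.0     # s


def risk_level(s: CarFollowingState,
               ittc_high: float = ITTC_HIGH,
               thw_low: float = THW_LOW) -> str:
    """Hypothetical combination rule: both risky -> high, one -> medium."""
    risky_ittc = inverse_ttc(s) >= ittc_high
    risky_thw = time_headway(s) <= thw_low
    if risky_ittc and risky_thw:
        return "high"
    if risky_ittc or risky_thw:
        return "medium"
    return "low"


if __name__ == "__main__":
    print(risk_level(CarFollowingState(10.0, 20.0, 10.0)))
    print(risk_level(CarFollowingState(50.0, 20.0, 20.0)))
```

Labels like these, aggregated over a trajectory, are one plausible way to assign each driver to a risk-based style cluster before reward learning.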