On-Demand Scenario Generation for Testing Automated Driving Systems

Songyang Yan, Xiaodong Zhang, Kunkun Hao, Haojie Xin, Yonggang Luo, Jucheng Yang, Ming Fan, Chao Yang, Jun Sun, Zijiang Yang

TL;DR

The paper tackles the challenge of evaluating Automated Driving Systems (ADS) across driving scenarios that vary in naturalness and risk. It introduces On-demand Scenario Generation (OSG), a framework that learns from real traffic data using a Naturalness Estimator and a traffic-prior model, and controls scenario risk with a Risk Intensity Regulator. Scenario generation is guided by a balance between adversarial risk and naturalness through the objective $G(\psi; \omega, \theta) = \left(\widetilde{ADV}(\psi)^{\omega^2} + \widetilde{NAT}(\psi;\theta)^{(1-\omega)^2}\right)^{e^{\omega(1-\omega)}}$. OSG organizes scenarios hierarchically as functional, logical, and concrete (FS/LS/CS), and uses a Speciation-based Particle Swarm Optimization to keep the generated scenarios diverse. Evaluations on the Carla and highway-env simulators across multiple ADSs (including Apollo, Interfuser, TF++, Highway Agent, Behavior Agent) show that OSG uncovers about 92.97% more accidents on average than the state-of-the-art BehAVExplor and provides a principled ADS scoring scheme that reflects safety performance across risk levels. The framework supports objective, risk-aware comparisons of ADS versions and architectures, enabling safer, more reliable autonomous driving development. The work suggests future extensions to incorporate more dynamic traffic factors and broader real-world validation.
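As a rough illustration of how the risk weight shapes this objective, the sketch below evaluates it for given scores. It reads the objective as G = (ADV^(ω²) + NAT^((1-ω)²))^(e^(ω(1-ω))), with ψ the scenario and ω the Risk Intensity Regulator setting. The function name `osg_objective`, the argument names, and the assumption that both normalized scores are non-negative and that ω lies in [0, 1] are illustrative choices, not details confirmed by the summary; how the scores themselves are computed (for example by the Naturalness Estimator) is outside this sketch.

```python
import math


def osg_objective(adv: float, nat: float, omega: float) -> float:
    """Evaluate the combined objective for one scenario.

    adv   -- normalized adversarial (risk) score, assumed >= 0
    nat   -- normalized naturalness score, assumed >= 0
    omega -- risk weight, assumed to lie in [0, 1]
    """
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega is assumed to lie in [0, 1]")
    if adv < 0.0 or nat < 0.0:
        raise ValueError("scores are assumed to be non-negative")
    # Each score is raised to a power that grows with its own weight.
    inner = adv ** (omega ** 2) + nat ** ((1.0 - omega) ** 2)
    # The outer exponent equals 1 at both extremes and peaks at omega = 0.5.
    return inner ** math.exp(omega * (1.0 - omega))


if __name__ == "__main__":
    for omega in (0.0, 0.5, 1.0):
        print(omega, osg_objective(adv=0.3, nat=0.8, omega=omega))
```

Under this reading, ω = 0 reduces the objective to 1 + NAT (risk is ignored), ω = 1 reduces it to ADV + 1 (naturalness is ignored), and intermediate values blend the two terms.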

Abstract

The safety and reliability of Automated Driving Systems (ADS) are paramount, necessitating rigorous testing methodologies to uncover potential failures before deployment. Traditional testing approaches often prioritize either natural scenario sampling or safety-critical scenario generation, resulting in overly simplistic or unrealistic hazardous tests. In practice, the demand for natural scenarios (e.g., when evaluating the ADS's reliability in real-world conditions), critical scenarios (e.g., when evaluating safety in critical situations), or somewhere in between (e.g., when testing the ADS in regions with less civilized drivers) varies depending on the testing objectives. To address this issue, we propose the On-demand Scenario Generation (OSG) Framework, which generates diverse scenarios with varying risk levels. Achieving the goal of OSG is challenging due to the complexity of quantifying the criticalness and naturalness stemming from intricate vehicle-environment interactions, as well as the need to maintain scenario diversity across various risk levels. OSG learns from real-world traffic datasets and employs a Risk Intensity Regulator to quantitatively control the risk level. It also leverages an improved heuristic search method to ensure scenario diversity. We evaluate OSG on the Carla simulator using various ADSs. We verify OSG's ability to generate scenarios with different risk levels and demonstrate its necessity by comparing accident types across risk levels. With the help of OSG, we are now able to systematically and objectively compare the performance of different ADSs based on different risk levels.

Paper Structure

This paper contains 17 sections, 10 equations, 9 figures, 4 tables, 1 algorithm.

Figures (9)

  • Figure 1: Comparison of scenarios at different risk levels. The green vehicle is the ADS under test (i.e., EGO), and the red vehicle is the background vehicle (i.e., NPC). Left scenario: The EGO and NPC move slowly in different lanes, far apart from each other. Middle scenario: A slower NPC attempts to merge into the lane occupied by the EGO, forcing the EGO to yield. Right scenario: An adjacent NPC suddenly changes lanes into the EGO's lane, resulting in a collision.
  • Figure 2: Highlighted scenarios in the example test suite.
  • Figure 3: In the Carla simulator, the Interfuser agent did not detect a slow-moving NPC encroaching into the lane, resulting in a collision. The four screenshots, arranged from left to right, capture the Carla simulator's state at 5, 7, 9, and 11 seconds of simulation time. The last image highlights the wire-frame representation of the vehicles at the point of collision.
  • Figure 4: The overall framework of our method for generating traffic scenarios.
  • Figure 5: Three stages of a scenario. From left to right, they are the functional scenario, logical scenario, and concrete scenario.
  • ...and 4 more figures