Adaptive Testing Environment Generation for Connected and Automated Vehicles with Dense Reinforcement Learning

Jingxuan Yang, Ruoxuan Bai, Haoyuan Ji, Yi Zhang, Jianming Hu, Shuo Feng

TL;DR

This work tackles efficient safety evaluation of connected and automated vehicles (CAVs) when surrogate models do not fully capture real-world behavior. It introduces AdaTE, an adaptive testing environment that combines multiple surrogate models, with combination coefficients obtained from a regression framed as a quadratic program and the regression target learned efficiently via dense reinforcement learning (DenseRL). The method concentrates learning on critical state-action pairs through a gap-based adaptive policy, and the paper proves that DenseRL converges to the optimal target. In high-dimensional overtaking scenarios, AdaTE achieves crash-rate estimates comparable to those of NADE and NDE with substantially fewer tests. The approach offers a practical path to robust, scalable evaluation of diverse CAVs, reducing the testing burden while maintaining accuracy. Future work includes extending the method from discretized to continuous state-action spaces and integrating adaptive evaluation of testing results.

Abstract

The assessment of safety performance plays a pivotal role in the development and deployment of connected and automated vehicles (CAVs). A common approach involves designing testing scenarios based on prior knowledge of CAVs (e.g., surrogate models), conducting tests in these scenarios, and subsequently evaluating CAVs' safety performances. However, substantial differences between CAVs and the prior knowledge can significantly diminish the evaluation efficiency. In response to this issue, existing studies predominantly concentrate on the adaptive design of testing scenarios during the CAV testing process. Yet, these methods have limitations in their applicability to high-dimensional scenarios. To overcome this challenge, we develop an adaptive testing environment that bolsters evaluation robustness by incorporating multiple surrogate models and optimizing the combination coefficients of these surrogate models to enhance evaluation efficiency. We formulate the optimization problem as a regression task utilizing quadratic programming. To efficiently obtain the regression target via reinforcement learning, we propose the dense reinforcement learning method and devise a new adaptive policy with high sample efficiency. Essentially, our approach centers on learning the values of critical scenes displaying substantial surrogate-to-real gaps. The effectiveness of our method is validated in high-dimensional overtaking scenarios, demonstrating that our approach achieves notable evaluation efficiency.
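The coefficient-fitting step described in the abstract can be sketched as a small constrained least-squares problem. This is a minimal illustration, not the paper's formulation: the abstract does not state the constraints, so the sketch assumes the coefficients are non-negative and sum to one, and it uses projected gradient descent in place of a dedicated quadratic-programming solver. The names `project_to_simplex`, `fit_combination`, `predictions`, and `target` are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold of the last index that stays positive
    return [max(x - theta, 0.0) for x in v]


def fit_combination(predictions, target, steps=5000):
    """Fit weights w minimizing (1/2n) * sum_i (sum_m w_m * p[m][i] - y_i)^2.

    predictions[m][i]: value given by surrogate model m at sample i (assumed input).
    target[i]: regression target at sample i (assumed input).
    Assumption: w is constrained to the probability simplex.
    """
    n_models, n = len(predictions), len(target)
    w = [1.0 / n_models] * n_models
    # The trace of the Gram matrix bounds its largest eigenvalue,
    # so 1 / trace is a safe step size for gradient descent.
    trace = sum(p * p for row in predictions for p in row) / n
    lr = 1.0 / trace
    for _ in range(steps):
        residual = [
            sum(w[m] * predictions[m][i] for m in range(n_models)) - target[i]
            for i in range(n)
        ]
        grad = [
            sum(residual[i] * predictions[m][i] for i in range(n)) / n
            for m in range(n_models)
        ]
        w = project_to_simplex([w[m] - lr * grad[m] for m in range(n_models)])
    return w


# Toy data: the target is an exact mixture of the first two surrogate models.
predictions = [
    [0.1, 0.4, 0.2, 0.8],
    [0.3, 0.1, 0.5, 0.2],
    [0.9, 0.7, 0.6, 0.1],
]
target = [0.7 * a + 0.3 * b for a, b in zip(predictions[0], predictions[1])]
weights = fit_combination(predictions, target)
print(weights)
```

On this toy data the fitted weights should concentrate on the first two models, since the target was built from them. A production implementation would more likely hand the same objective and constraints to an off-the-shelf QP solver.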

Paper Structure

This paper contains 15 sections, 2 theorems, 12 equations, 8 figures, 3 tables, 1 algorithm.

Key Result

Lemma 1

Consider the stochastic process $(\nu_t,\Delta_t,F_t)$, $t\in\mathbb{N}_{\geqslant0}$, where $\nu_t$, $\Delta_t$, $F_t:\Omega\to\mathbb{R}$ satisfy $\Delta_{t+1}(\omega)=[1-\nu_t(\omega)]\Delta_t(\omega)+\nu_t(\omega)F_t(\omega)$, $\omega\in\Omega$. Let $\mathcal{F}_t$ be a sequence of increasing ...
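The recursion in Lemma 1 can be simulated numerically to see how it behaves. The sketch below only illustrates the update rule $\Delta_{t+1}=(1-\nu_t)\Delta_t+\nu_t F_t$; the lemma's conditions are cut off above, so the choices here are assumptions made for illustration: a step size $\nu_t = 1/(t+1)$ and zero-mean Gaussian $F_t$. The function name `iterate_recursion` is hypothetical.

```python
import random


def iterate_recursion(noise, step_size, delta0=1.0):
    """Run Delta_{t+1} = (1 - nu_t) * Delta_t + nu_t * F_t and return all iterates."""
    delta = delta0
    history = [delta]
    for t, f_t in enumerate(noise):
        nu = step_size(t)
        delta = (1.0 - nu) * delta + nu * f_t
        history.append(delta)
    return history


random.seed(0)
# Assumption: F_t is i.i.d. zero-mean Gaussian noise.
noise = [random.gauss(0.0, 1.0) for _ in range(10000)]
# Assumption: nu_t = 1 / (t + 1), which turns the recursion into a running mean of F_0..F_t.
history = iterate_recursion(noise, lambda t: 1.0 / (t + 1))
print(history[-1])
```

With this step size the first update discards $\Delta_0$ entirely, and each later iterate equals the average of the noise seen so far, so the sequence shrinks toward zero as more zero-mean samples arrive.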

Figures (8)

  • Figure 1: Illustration of the surrogate-to-real gaps.
  • Figure 2: Illustration of the adaptive testing environment generation method with multiple SMs.
  • Figure 3: Illustration of the dense reinforcement learning method.
  • Figure 4: Illustrations of the four phases of overtaking scenarios (a) and the passing phase (Phase II) of overtaking scenarios (b). In overtaking scenarios, the AV will overtake BV and LV. In the passing phase, the AV will pass BV and LV. While AV is passing, BV may overtake LV.
  • Figure 5: (a) The crash rate estimations of AV-I in NDE and the NADE where the importance function is constructed from SM-III. (b) The RHW of crash rate estimations.
  • ...and 3 more figures

Theorems & Definitions (4)

  • Lemma 1
  • Theorem 1
  • Remark 1
  • Remark 2