Enhancing System-Level Safety in Mixed-Autonomy Platoon via Safe Reinforcement Learning

Jingyuan Zhou, Longhao Yan, Kaidi Yang

TL;DR

This paper proposes a safe DRL-based controller that provides a system-level safety guarantee for mixed-autonomy platoon control, together with a learning-based system identification approach that estimates the unknown human car-following behavior in the real system.

Abstract

Connected and automated vehicles (CAVs) have recently gained prominence in traffic research due to advances in communication technology and autonomous driving. Various longitudinal control strategies for CAVs have been developed to enhance traffic efficiency, stability, and safety in mixed-autonomy scenarios. Deep reinforcement learning (DRL) is one promising strategy for mixed-autonomy platoon control, thanks to its capability of managing complex scenarios in real time after sufficient offline training. However, there are three research gaps for DRL-based mixed-autonomy platoon control: (i) the lack of theoretical collision-free guarantees, (ii) the widely adopted but impractical assumption of skilled and rational drivers who will not collide with preceding vehicles, and (iii) the strong assumption of a known human driver model. To address these research gaps, we propose a safe DRL-based controller that can provide a system-level safety guarantee for mixed-autonomy platoon control. First, we combine control barrier function (CBF)-based safety constraints and DRL via a quadratic programming (QP)-based differentiable neural network layer to provide theoretical safety guarantees. Second, we incorporate system-level safety constraints into our proposed method to account for the safety of both CAVs and the following human-driven vehicles (HDVs), thereby addressing potential collisions caused by irrational human driving behavior. Third, we devise a learning-based system identification approach to estimate the unknown human car-following behavior in the real system. Simulation results demonstrate that our proposed method effectively ensures CAV safety and improves HDV safety in mixed platoon environments while simultaneously enhancing traffic capacity and string stability.
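To make the CBF-based safety layer more concrete, the following is a minimal sketch of a safety filter for a single CAV following one preceding vehicle. It is not the paper's controller: the double-integrator dynamics, the time-headway barrier h = s - tau*v - s_min, the linear class-K gain, all parameter values, and the names CbfParams, barrier, and safe_accel are assumptions chosen for illustration. Because the control input is a scalar, the QP "minimize (u - u_rl)^2 subject to the CBF constraint and actuator limits" reduces to clipping, so no QP solver (and no differentiable layer) is used here.

```python
from dataclasses import dataclass


@dataclass
class CbfParams:
    # All values below are illustrative assumptions, not taken from the paper.
    tau: float = 1.0     # time headway in the barrier [s]
    s_min: float = 2.0   # standstill spacing [m]
    gamma: float = 1.0   # gain of the linear class-K function alpha(h) = gamma * h
    u_min: float = -5.0  # strongest braking [m/s^2]
    u_max: float = 3.0   # strongest acceleration [m/s^2]


def barrier(s: float, v: float, p: CbfParams) -> float:
    """Barrier h(x) = s - tau * v - s_min; the state is treated as safe when h >= 0."""
    return s - p.tau * v - p.s_min


def safe_accel(u_rl: float, s: float, v: float, v_lead: float, p: CbfParams) -> float:
    """Project the RL acceleration u_rl onto the set allowed by the CBF constraint.

    Assumed dynamics: s_dot = v_lead - v, v_dot = u, so
    h_dot = v_lead - v - tau * u, and h_dot + gamma * h >= 0 gives
    u <= (v_lead - v + gamma * h) / tau.
    """
    h = barrier(s, v, p)
    u_cbf = (v_lead - v + p.gamma * h) / p.tau
    upper = min(p.u_max, u_cbf)
    # If the CBF bound lies below u_min the QP is infeasible; as an assumed
    # fallback, this sketch then applies the strongest available braking.
    upper = max(upper, p.u_min)
    # Closed-form solution of the scalar QP: clip u_rl to [u_min, upper].
    return min(max(u_rl, p.u_min), upper)


if __name__ == "__main__":
    p = CbfParams()
    print(safe_accel(2.0, s=50.0, v=20.0, v_lead=20.0, p=p))  # large gap
    print(safe_accel(1.0, s=22.0, v=20.0, v_lead=18.0, p=p))  # on the boundary, lead slower
```

With a large gap the filter leaves the RL action unchanged; on the boundary of the safe set with a slower leader it overrides the action with braking. The system-level constraints described in the abstract would add further constraints for the following HDVs, which this single-constraint sketch omits.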

Paper Structure (18 sections, 27 equations, 11 figures)

Figures (11)

  • Figure 1: Overview of the proposed controller for CAVs in the mixed-autonomy traffic framework. The dotted line represents backpropagation through the differentiable QP.
  • Figure 2: Learning-based human driver behavior identification module.
  • Figure 3: Training rewards per episode.
  • Figure 4: Comparison between online learning-based system identification and RLS.
  • Figure 5: Safety-guaranteed regions associated with two specific scenarios: (a) the deceleration disturbance of the preceding vehicle and (b) the acceleration disturbance of the following vehicle. The horizontal axis denotes the magnitude of the disturbance signal (acceleration/deceleration), and the vertical axis is the duration of the disturbance signal. The dark blue regions denote the safety region of the LCC with PPO controller without the safety layer, and the light blue regions illustrate the expanded safety region achieved by implementing the safe RL controller for LCC (i.e., with safety layer), while the white regions indicate the unsafe region.
  • ...and 6 more figures

Theorems & Definitions (2)

  • Definition 1: Control Barrier Function (Ames et al., 2014)
  • Remark 1
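
For reference, Definition 1 is not reproduced in this listing. The block below gives the form in which a control barrier function is commonly stated in the literature, for a control-affine system; the symbols f, g, h, alpha, U, and C are the conventional ones and are assumptions here, so the paper's own statement may differ in notation or detail.

```latex
% Control-affine system and safe set (conventional notation, assumed here)
\dot{x} = f(x) + g(x)\,u, \qquad u \in \mathcal{U}, \qquad
\mathcal{C} = \{\, x : h(x) \ge 0 \,\}

% h is a control barrier function if there exists an extended
% class-\mathcal{K} function \alpha such that, for all x in the domain,
\sup_{u \in \mathcal{U}} \big[\, L_f h(x) + L_g h(x)\,u \,\big] \;\ge\; -\alpha\big(h(x)\big)

% Any controller that satisfies the pointwise constraint
L_f h(x) + L_g h(x)\,u + \alpha\big(h(x)\big) \ge 0
% keeps the safe set \mathcal{C} forward invariant.
```

Since the constraint is linear in u, it can be imposed as a constraint of a QP that minimally modifies a nominal control input, which is the mechanism the abstract refers to.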