Enhancing System-Level Safety in Mixed-Autonomy Platoon via Safe Reinforcement Learning

Jingyuan Zhou; Longhao Yan; Kaidi Yang

TL;DR

The paper proposes a safe DRL-based controller that provides a system-level safety guarantee for mixed-autonomy platoon control, together with a learning-based system identification approach that estimates the unknown human car-following behavior in the real system.

Abstract

Connected and automated vehicles (CAVs) have recently gained prominence in traffic research due to advances in communication technology and autonomous driving. Various longitudinal control strategies for CAVs have been developed to enhance traffic efficiency, stability, and safety in mixed-autonomy scenarios. Deep reinforcement learning (DRL) is one promising strategy for mixed-autonomy platoon control, thanks to its capability of managing complex scenarios in real time after sufficient offline training. However, there are three research gaps for DRL-based mixed-autonomy platoon control: (i) the lack of theoretical collision-free guarantees, (ii) the widely adopted but impractical assumption of skilled and rational drivers who will not collide with preceding vehicles, and (iii) the strong assumption of a known human driver model. To address these research gaps, we propose a safe DRL-based controller that can provide a system-level safety guarantee for mixed-autonomy platoon control. First, we combine control barrier function (CBF)-based safety constraints and DRL via a quadratic programming (QP)-based differentiable neural network layer to provide theoretical safety guarantees. Second, we incorporate system-level safety constraints into our proposed method to account for the safety of both CAVs and the following human-driven vehicles (HDVs), addressing potential collisions due to irrational human driving behavior. Third, we devise a learning-based system identification approach to estimate the unknown human car-following behavior in the real system. Simulation results demonstrate that our proposed method effectively ensures CAV safety and improves HDV safety in mixed platoon environments while simultaneously enhancing traffic capacity and string stability.
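The CBF-QP idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single CAV with double-integrator dynamics, a time-headway barrier h = s - tau*v - s_min, and a scalar control, so the QP "stay as close as possible to the RL action subject to the CBF constraint" has a closed-form solution. The function name `cbf_safety_filter` and all parameter values (`tau`, `s_min`, `gamma`, `u_min`, `u_max`) are hypothetical. The paper instead uses a differentiable QP layer and also constrains the safety of the following HDVs, which this sketch omits.

```python
def cbf_safety_filter(u_rl, spacing, v_ego, v_lead,
                      tau=1.0, s_min=2.0, gamma=1.0,
                      u_min=-5.0, u_max=3.0):
    """Project an RL acceleration command onto a CBF-safe set (illustrative).

    Assumed dynamics: d(spacing)/dt = v_lead - v_ego, d(v_ego)/dt = u.
    Assumed barrier:  h = spacing - tau * v_ego - s_min  (safe when h >= 0).
    CBF condition:    dh/dt + gamma * h >= 0
                      (v_lead - v_ego) - tau * u + gamma * h >= 0
                      u <= ((v_lead - v_ego) + gamma * h) / tau
    """
    h = spacing - tau * v_ego - s_min
    u_upper = ((v_lead - v_ego) + gamma * h) / tau

    # With one scalar control and one upper bound, the QP
    #   min (u - u_rl)^2  s.t.  u <= u_upper
    # is solved by taking the smaller of the two values.
    u_safe = min(u_rl, u_upper)

    # Actuator limits are applied last. If u_upper < u_min the constrained
    # problem is infeasible and this returns the hardest braking as a fallback.
    return max(u_min, min(u_max, u_safe))


if __name__ == "__main__":
    # Large gap: the RL command passes through unchanged.
    print(cbf_safety_filter(2.0, spacing=30.0, v_ego=20.0, v_lead=20.0))
    # Small gap and slower leader: the command is overridden with braking.
    print(cbf_safety_filter(2.0, spacing=23.0, v_ego=20.0, v_lead=18.0))
```

In this toy setting the filter only intervenes when the RL action would violate the barrier condition, which is the property the safety layer is meant to provide. A system-level version would add further constraints for the following vehicles and would require a general QP solver.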

Paper Structure (18 sections, 27 equations, 11 figures)

Introduction
Preliminaries
Reinforcement Learning (RL)
Control Barrier Functions
Mixed-Autonomy Traffic Environment Modeling
Safety-Critical Learning-based Control for Mixed-Autonomy Platoons
RL-based Controller
Learning-Based Human Driver Behavior Identification
Differentiable Safety Module
CBF-QP-Based Approach to Incorporate Safety Guarantees
Differentiable QP for Neural Networks
Simulation Results
Training Settings and Results
Testing Results
Safety-Guaranteed Region Analysis
...and 3 more sections

Figures (11)

Figure 1: Overview of the proposed controller for CAVs in the mixed-autonomy traffic framework. The dotted line represents backpropagation through the differentiable QP.
Figure 2: Learning-based human driver behavior identification module.
Figure 3: Training rewards per episode.
Figure 4: Comparison between online learning-based system identification and recursive least squares (RLS).
Figure 5: Safety-guaranteed regions associated with two specific scenarios: (a) the deceleration disturbance of the preceding vehicle and (b) the acceleration disturbance of the following vehicle. The horizontal axis denotes the magnitude of the disturbance signal (acceleration/deceleration), and the vertical axis is the duration of the disturbance signal. The dark blue regions denote the safety region of the LCC with PPO controller without the safety layer, and the light blue regions illustrate the expanded safety region achieved by implementing the safe RL controller for LCC (i.e., with safety layer), while the white regions indicate the unsafe region.
...and 6 more figures

Theorems & Definitions (2)

Definition 1: Control Barrier Function [ames2014control]
Remark 1
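
For orientation, the control barrier function condition named in Definition 1 is commonly stated in the literature as follows. This is the standard textbook form for a control-affine system, not a transcription of the paper's definition; the symbols f, g, h, \alpha, \mathcal{C}, and \mathcal{U} are assumed notation and may differ from the paper's.

```latex
% Assumed control-affine dynamics and safe set
\dot{x} = f(x) + g(x)\,u, \qquad
\mathcal{C} = \{\, x \in \mathbb{R}^n : h(x) \ge 0 \,\}

% h is a control barrier function if there exists an extended
% class-\mathcal{K} function \alpha such that, for all x in the domain of interest,
\sup_{u \in \mathcal{U}} \big[\, L_f h(x) + L_g h(x)\,u \,\big] \;\ge\; -\alpha\big(h(x)\big)
```

Any control input satisfying this inequality at every state keeps trajectories that start inside \mathcal{C} inside it. Because the inequality is affine in u, it can be imposed as a linear constraint in a QP.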
