Hierarchical End-to-End Autonomous Driving: Integrating BEV Perception with Deep Reinforcement Learning

Siyi Lu; Lei He; Shengbo Eben Li; Yugong Luo; Jianqiang Wang; Keqiang Li

TL;DR

This paper proposes a DRL-based end-to-end driving framework that uses multi-sensor inputs to construct a unified three-dimensional understanding of the environment. Its BEV-based feature extractor translates critical environmental features into high-level abstract states for DRL, facilitating more informed control.

Abstract

End-to-end autonomous driving offers a streamlined alternative to the traditional modular pipeline, integrating perception, prediction, and planning within a single framework. While Deep Reinforcement Learning (DRL) has recently gained traction in this domain, existing approaches often overlook the critical connection between DRL feature extraction and perception. In this paper, we bridge this gap by mapping the DRL feature extraction network directly to the perception phase, enabling clearer interpretation through semantic segmentation. By leveraging Bird's-Eye-View (BEV) representations, we propose a novel DRL-based end-to-end driving framework that utilizes multi-sensor inputs to construct a unified three-dimensional understanding of the environment. This BEV-based system extracts and translates critical environmental features into high-level abstract states for DRL, facilitating more informed control. Extensive experimental evaluations demonstrate that our approach not only enhances interpretability but also significantly outperforms state-of-the-art methods in autonomous driving control tasks, reducing the collision rate by 20%.
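The data flow described in the abstract (surround-camera images, a shared BEV latent state, a DRL policy that emits controls, and a segmentation decoder that makes the latent interpretable) can be illustrated with a toy sketch. This is not the authors' implementation: every class name, dimension, and the random linear layers below are hypothetical stand-ins chosen only to show how one latent vector can feed both the control head and the segmentation head.

```python
import math
import random

NUM_CAMERAS = 6   # surround cameras, matching the six views in Figure 4
PIXELS = 12       # toy flattened image size (assumption)
BEV_CELLS = 8     # toy number of BEV grid cells (assumption)
NUM_CLASSES = 3   # toy number of semantic classes (assumption)


def make_matrix(rows, cols, rng):
    """Random weight matrix standing in for a trained layer."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyBEVEncoder:
    """Hypothetical stand-in for the BEV feature extraction network."""

    def __init__(self, rng):
        # One projection per camera into the shared BEV grid.
        self.weights = [make_matrix(BEV_CELLS, PIXELS, rng) for _ in range(NUM_CAMERAS)]

    def encode(self, images):
        bev = [0.0] * BEV_CELLS
        for weights, image in zip(self.weights, images):
            for i, value in enumerate(matvec(weights, image)):
                bev[i] += value  # fuse all views in one BEV frame
        return [math.tanh(v) for v in bev]


class ToySegmentationHead:
    """Decodes the latent state into one class label per BEV cell."""

    def __init__(self, rng):
        self.weights = make_matrix(BEV_CELLS * NUM_CLASSES, BEV_CELLS, rng)

    def decode(self, latent):
        logits = matvec(self.weights, latent)
        labels = []
        for cell in range(BEV_CELLS):
            cell_logits = logits[cell * NUM_CLASSES:(cell + 1) * NUM_CLASSES]
            labels.append(max(range(NUM_CLASSES), key=cell_logits.__getitem__))
        return labels


class ToyPolicy:
    """Hypothetical policy head mapping the latent state to controls."""

    def __init__(self, rng):
        self.weights = make_matrix(3, BEV_CELLS, rng)

    def act(self, latent):
        t, b, s = matvec(self.weights, latent)
        sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
        # Throttle and brake in (0, 1); steering in [-1, 1].
        return {"throttle": sigmoid(t), "brake": sigmoid(b), "steer": math.tanh(s)}


def drive_step(images, encoder, policy, seg_head):
    """One control step: the same latent feeds control and interpretation."""
    latent = encoder.encode(images)
    return policy.act(latent), seg_head.decode(latent)
```

In this sketch the segmentation head reads the same latent vector that the policy consumes, which is the property the abstract relies on for interpretability: the decoded BEV map shows what the control head was given as its state.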

Paper Structure (16 sections, 2 equations, 4 figures, 2 tables)

Introduction
Related Work
Traditional Modular Approach
Deep Reinforcement Learning for Autonomous Driving
Explainability of Autonomous Driving
Approach
Problem Formulation
Deep Reinforcement Learning-Based Autonomous Driving
BEV Feature Extraction Network
Semantic Segmentation of Latent Feature
Experimental Results
Experimental Setup
Evaluation of autonomous driving in different maps
Evaluation of autonomous driving in high-congestion environments
Interpretability
...and 1 more sections

Figures (4)

Figure 1: Our perception-driven end-to-end autonomous driving model, built on deep reinforcement learning. A feature extraction network based on the bird's-eye-view space processes the input surround-camera images and outputs high-dimensional features to the reinforcement learning policy network, which directly outputs control commands for the vehicle's throttle, brake, and steering wheel.
Figure 2: Neural network architecture of the proposed framework. On the left is the architecture of deep reinforcement learning, and on the right is the architecture of the BEV feature extraction network.
Figure 3: Reward curves of the DRL and Ours-3 methods during reinforcement learning training.
Figure 4: Illustration of the interpretability of our approach. Each sampling frame is randomly selected from the experiment. The six images in each sampling frame are taken by a set of surround cameras. The picture on the right is the semantic segmentation result generated from these six images.
