Unified End-to-End V2X Cooperative Autonomous Driving
Zhiwei Li, Bozhen Zhang, Lei Yang, Tianyu Shen, Nuo Xu, Ruosen Hao, Weiting Li, Tao Yan, Huaping Liu
TL;DR
This work addresses a gap in V2X cooperative driving research, which has often prioritized perception accuracy over accident prediction. It introduces UniE2EV2X, a unified end-to-end framework that fuses vehicle and infrastructure data through a temporal BEV-based cooperative feature encoder and an integrated perception, motion, and accident-prediction module, all powered by deformable attention. Key contributions include (i) a V2X cooperative feature encoding pipeline with spatial BEV mapping and temporal cascading fusion, (ii) a unified end-to-end architecture for detection, tracking, motion prediction, and accident prediction, and (iii) validation on the DeepAccident dataset showing improved accident prediction and end-to-end perception performance. The approach is practically significant for safer autonomous driving: it leverages V2X cooperation to improve situational awareness and predictive safety assessment in complex traffic scenarios.
Abstract
V2X cooperation, through the integration of sensor data from both vehicles and infrastructure, is considered a pivotal approach to advancing autonomous driving technology. Current research primarily focuses on enhancing perception accuracy and often overlooks the systematic improvement of accident prediction accuracy through end-to-end learning, which leaves the safety of autonomous driving insufficiently addressed. To address this challenge, this paper introduces the UniE2EV2X framework, a V2X-integrated end-to-end autonomous driving system that consolidates key driving modules within a unified network. The framework employs a deformable attention-based data fusion strategy, effectively facilitating cooperation between vehicles and infrastructure. The main advantages include: 1) significantly enhancing agents' perception and motion prediction capabilities, thereby improving the accuracy of accident predictions; 2) ensuring high reliability in the data fusion process; 3) delivering superior end-to-end perception compared with modular approaches. Furthermore, we implement the UniE2EV2X framework on DeepAccident, a challenging simulation dataset designed for V2X cooperative driving.
