Embodied Cognition Augmented End2End Autonomous Driving
Ling Niu, Xiaoji Zheng, Han Wang, Chen Zheng, Ziyuan Yang, Bokui Chen, Jiangtao Gong
TL;DR
This work tackles the supervision gap in vision-based end-to-end autonomous driving by introducing E3AD, a brain-inspired paradigm that learns driving cognition through cross-modal contrastive learning between a visual driving encoder and the large EEG model LaBraM. A Driving-Thinking model is trained on a self-collected cognitive dataset and then frozen to augment mainstream end-to-end driving frameworks through three interaction schemes, yielding substantial gains in planning accuracy and safety metrics while maintaining efficient inference. Key contributions include the first integration of human driving cognition into end-to-end planning, ablation analyses validating the role of EEG-guided cognition, and demonstrations of improved performance on nuScenes and Bench2Drive, with plans to release the dataset and code. The approach lays groundwork for embodied, brain-inspired augmentation of autonomous driving systems, though it is limited by the scale of the available cognitive data and leaves the underlying cognitive mechanisms open for further study.
Abstract
In recent years, vision-based end-to-end autonomous driving has emerged as a new paradigm. However, popular end-to-end approaches typically rely on visual feature extraction networks trained under label supervision. This limited form of supervision restricts the generality and applicability of driving models. In this paper, we propose a novel paradigm termed $E^{3}AD$, which performs contrastive learning between a visual feature extraction network and a general-purpose large EEG model in order to learn latent human driving cognition for enhancing end-to-end planning. We collected a cognitive dataset for this contrastive learning process. We then investigated the methods and potential mechanisms for enhancing end-to-end planning with human driving cognition, using popular driving models as baselines on publicly available autonomous driving datasets. Both open-loop and closed-loop tests are conducted for a comprehensive evaluation of planning performance. Experimental results demonstrate that the $E^{3}AD$ paradigm significantly enhances the end-to-end planning performance of the baseline models. Ablation studies further validate the contribution of driving cognition and the effectiveness of the contrastive learning process. To the best of our knowledge, this is the first work to integrate human driving cognition for improving end-to-end autonomous driving planning. It represents an initial attempt to incorporate embodied cognitive data into end-to-end autonomous driving, providing valuable insights for future brain-inspired autonomous driving systems. Our code will be made available on GitHub.
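As a rough illustration of what contrastive learning between a visual encoder and an EEG model could look like, the sketch below computes a symmetric InfoNCE loss over paired visual and EEG embeddings. The abstract does not state the exact objective, so the symmetric (CLIP-style) form, the temperature of 0.07, and every function name here are assumptions made for illustration, not the authors' implementation. The embeddings are random stand-ins, not data from the cognitive dataset.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_info_nce(visual_emb, eeg_emb, temperature=0.07):
    """Symmetric InfoNCE: row i of each input is assumed to be a matched pair."""
    v = [l2_normalize(x) for x in visual_emb]
    e = [l2_normalize(x) for x in eeg_emb]
    # Cosine similarity between every visual and every EEG embedding.
    logits = [
        [sum(a * b for a, b in zip(vi, ej)) / temperature for ej in e]
        for vi in v
    ]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the visual-to-EEG and EEG-to-visual directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy stand-ins for encoder outputs (batch of 8, dimension 32).
rng = random.Random(0)
eeg = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(8)]
aligned_visual = [[x + 0.01 * rng.gauss(0, 1) for x in row] for row in eeg]
random_visual = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(8)]

aligned_loss = symmetric_info_nce(aligned_visual, eeg)
random_loss = symmetric_info_nce(random_visual, eeg)
print(f"aligned: {aligned_loss:.4f}  random: {random_loss:.4f}")
```

In this toy setup the loss is low when each visual embedding lies close to its paired EEG embedding and high when the pairing carries no information, which is the signal a visual encoder would be trained on if the EEG side were kept fixed.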
