Cognitive-Hierarchy Guided End-to-End Planning for Autonomous Driving
Zhennan Wang, Jianing Teng, Canqun Xiang, Kangliang Chen, Xing Pan, Lu Deng, Weihao Gu
TL;DR
CogAD addresses the gap between end-to-end autonomous driving and human cognitive processes by introducing a cognitively inspired, hierarchical framework. It combines hierarchical perception (global BEV context followed by instance-level refinement) with hierarchical planning (intent-driven high-level decisions followed by trajectory-level generation), underpinned by dual uncertainty modeling through online trajectory anchors and shared motion mode embeddings. Cross-task instance interactions and BEV adapters fuse scene-wide context with object-level details, enabling diverse yet plausible multi-modal trajectories. Empirically, CogAD achieves state-of-the-art results on nuScenes and Bench2Drive, with strong generalization to long-tail and complex real-world scenarios, while maintaining efficiency and not relying on ego state or history inputs.
Abstract
While end-to-end autonomous driving has advanced significantly, prevailing methods remain fundamentally misaligned with human cognitive principles in both perception and planning. In this paper, we propose CogAD, a novel end-to-end autonomous driving model that emulates the hierarchical cognition mechanisms of human drivers. CogAD implements dual hierarchical mechanisms: global-to-local context processing for human-like perception, and intent-conditioned multi-mode trajectory generation for cognitively inspired planning. The proposed method offers three principal advantages: comprehensive environmental understanding through hierarchical perception, robust planning exploration enabled by multi-level planning, and diverse yet reasonable multi-modal trajectory generation facilitated by dual-level uncertainty modeling. Extensive experiments on nuScenes and Bench2Drive show that CogAD achieves state-of-the-art performance in end-to-end planning, with particular strength in long-tail scenarios and robust generalization to complex real-world driving conditions.
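
To make the idea of intent-conditioned, two-level planning more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a planner that first scores a small set of high-level intents and then scores several candidate trajectory modes under each intent. The names `TrajectoryMode` and `select_plan`, the intent labels, the scores, and the waypoints are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Waypoint = Tuple[float, float]  # (x, y) in metres, ego frame (assumed convention)


@dataclass
class TrajectoryMode:
    """One candidate trajectory with a hypothetical confidence score."""
    waypoints: List[Waypoint]
    score: float


def select_plan(
    intent_scores: Dict[str, float],
    modes_by_intent: Dict[str, List[TrajectoryMode]],
) -> Tuple[str, TrajectoryMode]:
    """Two-level selection: pick the intent first, then a mode under it."""
    # Level 1: high-level decision, the intent with the highest score.
    intent = max(intent_scores, key=lambda name: intent_scores[name])
    # Level 2: trajectory-level decision, restricted to the chosen intent.
    best_mode = max(modes_by_intent[intent], key=lambda mode: mode.score)
    return intent, best_mode


# Hypothetical scores and candidates, not taken from the paper.
intent_scores = {"go_straight": 0.7, "turn_left": 0.2, "turn_right": 0.1}
modes_by_intent = {
    "go_straight": [
        TrajectoryMode([(0.0, 2.0), (0.0, 4.0)], score=0.4),
        TrajectoryMode([(0.0, 3.0), (0.0, 6.0)], score=0.6),
    ],
    "turn_left": [TrajectoryMode([(-1.0, 2.0), (-3.0, 3.0)], score=0.9)],
    "turn_right": [TrajectoryMode([(1.0, 2.0), (3.0, 3.0)], score=0.8)],
}

intent, plan = select_plan(intent_scores, modes_by_intent)
print(intent, plan.waypoints[-1])
```

Note that the left-turn candidate has the highest mode score overall, yet it is not selected because the intent-level decision is made first. This is only one simple way to condition trajectory choice on intent; a learned model would typically produce both sets of scores jointly.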
