SUPER-AD: Semantic Uncertainty-aware Planning for End-to-End Robust Autonomous Driving

Wonjeong Ryu, Seungjun Yu, Seokha Moon, Hojun Choi, Junsung Park, Jinkyu Kim, Hyunjung Shim

TL;DR

This paper tackles the lack of uncertainty awareness in end-to-end autonomous driving by introducing a camera-only framework that models aleatoric uncertainty directly in Bird's-Eye View space. It produces a dense uncertainty-aware drivable score map and employs a lane-following regularization to stabilize plans while preserving maneuvers. Uncertainty is captured by sampling from the perceptual output logits to form a probabilistic drivable map that weights candidate trajectories, improving safety in uncertain or occluded regions. Evaluated on NAVSIM, the approach achieves state-of-the-art results, particularly on challenging and safety-critical NAVHARD and NAVSAFE subsets, underscoring the practical benefits of integrating perception uncertainty and driving priors into planning.
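The sampling-and-weighting idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation. It assumes each BEV cell has a Gaussian over its drivable logit (a predicted mean and standard deviation), that the drivable probability is the Monte Carlo average of the sigmoid over sampled logits, and that a trajectory is scored by the mean probability of the cells it crosses. The function names `drivable_probability` and `trajectory_weights`, the grid layout, and the normalization are all hypothetical.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def drivable_probability(logit_mean, logit_std, num_samples=64, rng=None):
    """Monte Carlo estimate of E[sigmoid(z)] with z ~ N(mean, std^2), per BEV cell.

    Assumption: a Gaussian over each logit; the paper's exact distribution may differ.
    """
    rng = rng or random.Random(0)
    height, width = len(logit_mean), len(logit_mean[0])
    prob = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            total = 0.0
            for _ in range(num_samples):
                z = rng.gauss(logit_mean[r][c], logit_std[r][c])
                total += sigmoid(z)
            prob[r][c] = total / num_samples
    return prob


def trajectory_weights(prob_map, trajectories):
    """Score each trajectory by the mean drivable probability of its cells,
    then normalize the scores so they sum to 1 (an illustrative choice)."""
    scores = [
        sum(prob_map[r][c] for r, c in traj) / len(traj)
        for traj in trajectories
    ]
    total = sum(scores)
    return [s / total for s in scores]


# Toy 3x2 grid: both columns have the same mean logit, but the right
# column carries a much larger predicted standard deviation.
mean = [[4.0, 4.0], [4.0, 4.0], [4.0, 4.0]]
std = [[0.1, 6.0], [0.1, 6.0], [0.1, 6.0]]
prob = drivable_probability(mean, std)

left = [(0, 0), (1, 0), (2, 0)]   # stays in the low-uncertainty column
right = [(0, 1), (1, 1), (2, 1)]  # passes through the high-uncertainty column
print(trajectory_weights(prob, [left, right]))
```

In this toy setup both columns share the same mean logit, so a planner that reads only the mean would rate the two candidates equally. Averaging the sigmoid over sampled logits pulls the high-variance cells toward 0.5, so the trajectory through the uncertain column receives the smaller weight.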

Abstract

End-to-End (E2E) planning has become a powerful paradigm for autonomous driving, yet current systems remain fundamentally uncertainty-blind. They assume perception outputs are fully reliable, even in ambiguous or poorly observed scenes, leaving the planner without an explicit measure of uncertainty. To address this limitation, we propose a camera-only E2E framework that estimates aleatoric uncertainty directly in BEV space and incorporates it into planning. Our method produces a dense, uncertainty-aware drivability map that captures both semantic structure and geometric layout at pixel-level resolution. To further promote safe and rule-compliant behavior, we introduce a lane-following regularization that encodes lane structure and traffic norms. This prior stabilizes trajectory planning under normal conditions while preserving the flexibility needed for maneuvers such as overtaking or lane changes. Together, these components enable robust and interpretable trajectory planning, even under challenging uncertainty conditions. Evaluated on the NAVSIM benchmark, our method achieves state-of-the-art performance, delivering substantial gains on both the challenging NAVHARD and NAVSAFE subsets. These results demonstrate that our principled aleatoric uncertainty modeling combined with driving priors significantly advances the safety and reliability of camera-only E2E autonomous driving.

Paper Structure

This paper contains 13 sections, 10 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Illustration of overconfidence-induced misplanning. The model deviates from the correct path toward non-drivable regions because the drivable confidence score map assigns these areas undesirably high confidence. (a) shows representative failure cases in which the model is misled by overconfident predictions. (b) presents quantitative evidence of this general trend.
  • Figure 2: Overview of our framework at inference time. Our model first extracts Bird's-Eye View (BEV) features from multi-view images. The segmentation head then uses these features to predict a distribution composed of logits and uncertainty. In parallel, the planner, based on DiffusionDrive, estimates candidate trajectories from the encoder outputs and ego-vehicle features. Once the trajectories are estimated, the predicted distribution is used to evaluate the candidates. This weighting mechanism penalizes trajectories that pass through high-uncertainty areas, which prevents them from being selected.
  • Figure 3: Qualitative results on challenging scenes. Scene A depicts a diverging lane entry where occlusions and viewpoint rotations increase uncertainty. Scene B involves safe lane changes in the presence of multiple objects, highlighting risk-aware planning.
  • Figure 4: Qualitative results of our ablation study on uncertainty. The w/o Uncertainty variant employs the DiffusionDrive planning module with its perception module replaced by a camera-only BEV encoder. The w/ Uncertainty variant denotes our proposed method, which incorporates uncertainty. The pink bounding boxes highlight how effectively the predicted trajectory leverages uncertainty.