Predictive Uncertainty Quantification for Bird's Eye View Segmentation: A Benchmark and Novel Loss Function

Linlin Yu, Bowen Yang, Tianhao Wang, Kangshuo Li, Feng Chen

TL;DR

This paper addresses uncertainty quantification (UQ) for Bird's Eye View semantic segmentation (BEVSS). It introduces the first BEVSS uncertainty benchmark and proposes the Uncertainty-Focal-Cross-Entropy (UFCE) loss, together with an uncertainty-quantification framework that combines Epistemic Uncertainty Scaling (EUS) and Evidence Regularized Learning (ER). Built on evidential deep learning with Dirichlet distributions, the approach captures both aleatoric and epistemic uncertainty and improves calibration and out-of-distribution (OOD) detection, especially when pseudo-OOD data are used during training. In experiments on CARLA, nuScenes, and Lyft with three BEV backbones, UFCE-EUS-ER consistently achieves top OOD-detection performance (AUPR/AUROC) and calibration while maintaining segmentation accuracy. The findings highlight the limitations of existing UQ methods in BEVSS and show that targeted loss design and regularization can substantially improve the reliability of BEV-based perception in autonomous systems.

Abstract

The fusion of raw sensor data to create a Bird's Eye View (BEV) representation is critical for autonomous vehicle planning and control. Despite the growing interest in using deep learning models for BEV semantic segmentation, anticipating segmentation errors and enhancing the explainability of these models remain underexplored. This paper introduces a comprehensive benchmark for predictive uncertainty quantification in BEV segmentation, evaluating multiple uncertainty quantification methods across three popular datasets with three representative network architectures. Our study focuses on the effectiveness of quantified uncertainty in detecting misclassified and out-of-distribution (OOD) pixels while also improving model calibration. Through empirical analysis, we uncover challenges in existing uncertainty quantification methods and demonstrate the potential of evidential deep learning techniques, which capture both aleatoric and epistemic uncertainty. To address these challenges, we propose a novel loss function, Uncertainty-Focal-Cross-Entropy (UFCE), specifically designed for highly imbalanced data, along with a simple uncertainty-scaling regularization term that improves both uncertainty quantification and model calibration for BEV segmentation.
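To make the aleatoric/epistemic distinction concrete, here is a minimal sketch of how the two are often read off a per-pixel Dirichlet $\mathtt{Dir}(\boldsymbol{\alpha})$ in evidential deep learning. The specific measures are assumptions based on common practice, not definitions taken from this paper: aleatoric uncertainty is taken as the entropy of the expected class probabilities $\bar{p}_c = \alpha_c / \alpha_0$, and epistemic uncertainty as the vacuity $C / \alpha_0$, where $\alpha_0 = \sum_c \alpha_c$. The function name `dirichlet_uncertainty` is hypothetical.

```python
import math


def dirichlet_uncertainty(alpha):
    """Summarize one pixel's Dirichlet Dir(alpha) with common EDL quantities.

    Assumed conventions (not confirmed by the paper excerpt):
      - aleatoric = entropy of the expected categorical distribution
      - epistemic = vacuity C / alpha_0 (in (0, 1] when every alpha_c >= 1)
    """
    if any(a <= 0 for a in alpha):
        raise ValueError("all concentration parameters must be positive")
    alpha0 = sum(alpha)                       # total concentration (evidence)
    p_bar = [a / alpha0 for a in alpha]       # expected class probabilities
    aleatoric = -sum(p * math.log(p) for p in p_bar)
    epistemic = len(alpha) / alpha0
    return p_bar, aleatoric, epistemic


# Two pixels with the same expected probabilities but different evidence.
low_evidence = dirichlet_uncertainty([1.0, 1.0])
high_evidence = dirichlet_uncertainty([50.0, 50.0])
print("low evidence: ", low_evidence)
print("high evidence:", high_evidence)
```

Both pixels have the same expected probabilities and therefore the same aleatoric score, but the low-evidence pixel has a much larger epistemic score. This is the property that makes total evidence a natural signal for OOD detection.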

Paper Structure

This paper contains 29 sections, 7 theorems, 58 equations, 6 figures, and 24 tables.

Key Result

Proposition 1

Given a predicted distribution $\mathbf{p} \sim \mathtt{Dir}(\boldsymbol{\alpha})$, where $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \dots, \alpha_C)$ and $C$ is the number of categories, and a target distribution $\boldsymbol{q} \sim \mathtt{Dir}(\hat{\boldsymbol{\alpha}})$, assuming a one-hot sty

Figures (6)

  • Figure 1: The y-axis represents the difference between the L1-norms of the UFCE and UCE gradients with $\alpha_{c^*} = 5$, while the x-axis corresponds to $\bar{p}_{c^*}$, the expected predicted probability of the ground truth class.
  • Figure 2: Numerical analysis of $\left\| \frac{\partial \mathcal{L}^{\text{Ufce}}}{\partial \mathbf{w}_{c^*}} \right\| - \left\| \frac{\partial \mathcal{L}^{\text{Uce}}}{\partial \mathbf{w}_{c^*}} \right\|$ for different combinations of $\alpha_{c^*}$ and $\gamma$.
  • Figure 3: Impact of the implicit weight regularization induced by UFCE with $\alpha_{c^*}=10$.
  • Figure 4: Comparison of Semantic Segmentation Performance: Each row represents an example, with the first column showing the ground truth labels, where the yellow regions indicate the positive class ("vehicle" in these examples). We visualize the predicted probabilities for the positive class generated by our model and four baselines. Brighter regions correspond to higher probability values.
  • Figure 5: Comparison of Predicted Aleatoric Uncertainty for Misclassification Detection: Each row represents an example, with each pair of columns corresponding to one model. The left column shows the misclassified labels, where yellow indicates misclassified pixels, while the right column visualizes the predicted aleatoric uncertainty for the same model, with brighter regions representing higher uncertainty values.
  • ...and 1 more figure
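Figures 1 to 3 compare $\mathcal{L}^{\text{Ufce}}$ with $\mathcal{L}^{\text{Uce}}$, but this excerpt does not state either formula. The sketch below is therefore built on two labeled assumptions. First, UCE is taken to be the expected cross entropy under $\mathtt{Dir}(\boldsymbol{\alpha})$, which has the well-known closed form $\psi(\alpha_0) - \psi(\alpha_{c^*})$. Second, a focal variant is taken to be the expectation of the focal loss $-(1 - p_{c^*})^{\gamma} \log p_{c^*}$ under the same Dirichlet; whether this matches the paper's UFCE exactly is not confirmed here. The names `digamma`, `uce`, and `focal_uce_sketch` are hypothetical helpers.

```python
import math


def digamma(x):
    """Digamma function via upward recurrence plus an asymptotic series."""
    if x <= 0:
        raise ValueError("x must be positive")
    result = 0.0
    while x < 6.0:            # psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result


def uce(alpha, c):
    """Expected cross entropy E[-log p_c] for p ~ Dir(alpha)."""
    return digamma(sum(alpha)) - digamma(alpha[c])


def focal_uce_sketch(alpha, c, gamma):
    """Assumed focal variant: E[-(1 - p_c)^gamma * log p_c] for p ~ Dir(alpha).

    The marginal of p_c is Beta(a, b) with a = alpha_c and b = alpha_0 - a, so
    the expectation equals B(a, b + gamma) / B(a, b) * (psi(alpha_0 + gamma) - psi(a)).
    Requires at least two classes so that b > 0.
    """
    alpha0 = sum(alpha)
    a = alpha[c]
    b = alpha0 - a
    log_weight = (math.lgamma(alpha0) + math.lgamma(b + gamma)
                  - math.lgamma(alpha0 + gamma) - math.lgamma(b))
    return math.exp(log_weight) * (digamma(alpha0 + gamma) - digamma(a))


# Confident, correct pixel (alpha_{c*} = 5, as in Figure 1) versus an uncertain one.
for alpha in ([5.0, 1.0], [1.0, 1.0]):
    print(alpha, uce(alpha, 0), focal_uce_sketch(alpha, 0, gamma=2.0))
```

With $\gamma = 0$ the focal weight equals one and the sketch reduces to UCE. For $\gamma > 0$, pixels whose ground-truth class already has a high expected probability contribute less, which is the usual motivation for focal-style losses on highly imbalanced data.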

Theorems & Definitions (12)

  • Proposition 1
  • Proposition 2
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • Proposition 4
  • proof
  • Proposition 5
  • proof
  • ...and 2 more