
Calibrating the Full Predictive Class Distribution of 3D Object Detectors for Autonomous Driving

Cornelius Schröder, Marius-Raphael Schlüter, Markus Lienkamp

TL;DR

This work addresses calibration of the full predictive class distribution for 3D object detectors in autonomous driving, arguing that planning decisions depend on the entire confidence vector rather than just the top class. It introduces the Full D-ECE metric to capture cross-class calibration under strong calibration and proposes two train-time auxiliary losses, $\mathcal{L}_{\text{DECE}}$ and $\mathcal{L}_{\text{FullDECE}}$, alongside post-hoc methods. Experiments on CenterPoint, PillarNet, and DSVT-Pillar with Waymo Open data show detector-specific gains: CenterPoint and PillarNet benefit from combining $\mathcal{L}_{\text{FullDECE}}$ with Isotonic Regression fitted on the full predictions, while DSVT-Pillar responds best to Adaptive Focal Loss with Isotonic Regression. The findings indicate that no single calibration recipe fits all architectures and underscore the importance of calibration-aware evaluation for safer, planning-aware autonomous systems. Together, the losses $\mathcal{L}_{\text{DECE}}$ and $\mathcal{L}_{\text{FullDECE}}$ and the metrics D-ECE and Full D-ECE provide practical training-time and assessment tools for aligning predicted class probabilities with empirical frequencies across all classes.

Abstract

In autonomous systems, precise object detection and uncertainty estimation are critical for self-aware and safe operation. This work addresses confidence calibration for the classification task of 3D object detectors. We argue that it is necessary to consider the calibration of the full predictive confidence distribution over all classes, and we derive a metric which captures the calibration of dominant and secondary class predictions. We propose two auxiliary regularizing loss terms which introduce either calibration of the dominant prediction or calibration of the full prediction vector as a training goal. We evaluate a range of post-hoc and train-time methods for CenterPoint, PillarNet and DSVT-Pillar and find that combining our loss term, which regularizes for calibration of the full class prediction, with isotonic regression leads to the best calibration of CenterPoint and PillarNet with respect to both dominant and secondary class predictions. We further find that DSVT-Pillar cannot be jointly calibrated for dominant and secondary predictions using the same method.
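The difference between evaluating only the dominant prediction and evaluating the full confidence vector can be illustrated with a small sketch. The exact definitions of D-ECE and Full D-ECE are not reproduced in this summary, so the code below falls back on a plain binned expected calibration error over confidence alone. The function names, the equal-width binning, the toy data, and the convention that `None` marks a detection without a matched ground-truth object are all assumptions made for illustration, not the paper's implementation.

```python
from typing import List, Optional, Sequence, Tuple


def binned_ece(confidences: Sequence[float], hits: Sequence[int], n_bins: int = 10) -> float:
    """Weighted mean gap between observed frequency and mean confidence per bin."""
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0.0] * n_bins
    for conf, hit in zip(confidences, hits):
        b = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 falls in the last bin
        counts[b] += 1
        conf_sums[b] += conf
        hit_sums[b] += hit
    total = sum(counts)
    if total == 0:
        return 0.0
    # |frequency - mean confidence| * (n_bin / total) equals |hit_sum - conf_sum| / total
    return sum(abs(h - c) for h, c in zip(hit_sums, conf_sums)) / total


def dominant_pairs(probs: Sequence[Sequence[float]],
                   labels: Sequence[Optional[int]]) -> Tuple[List[float], List[int]]:
    """One (confidence, hit) pair per detection: the top-scoring class only."""
    confs, hits = [], []
    for p, label in zip(probs, labels):
        top = max(range(len(p)), key=lambda k: p[k])
        confs.append(p[top])
        hits.append(1 if top == label else 0)
    return confs, hits


def full_pairs(probs: Sequence[Sequence[float]],
               labels: Sequence[Optional[int]]) -> Tuple[List[float], List[int]]:
    """One (confidence, hit) pair per detection AND class: the whole vector."""
    confs, hits = [], []
    for p, label in zip(probs, labels):
        for k, conf in enumerate(p):
            confs.append(conf)
            hits.append(1 if k == label else 0)
    return confs, hits


# Toy data (hypothetical): two detections, three classes.
probs = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]]
labels = [0, 1]  # assumed matched ground-truth class; None would mean unmatched

dominant_ece = binned_ece(*dominant_pairs(probs, labels))
full_ece = binned_ece(*full_pairs(probs, labels))
print(f"dominant-only ECE: {dominant_ece:.3f}, full-vector ECE: {full_ece:.3f}")
```

In this toy case the two scores differ because the secondary confidences (0.2, 0.4, 0.1) enter only the full-vector evaluation, which is the effect the abstract argues a dominant-only metric cannot capture.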

Paper Structure

This paper contains 13 sections, 8 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: When the miscalibration of the baseline detectors is evaluated for dominant predictions only (a), CenterPoint and PointPillar are better calibrated than DSVT-Pillar. When the full vector of confidence predictions is considered (b), DSVT-Pillar becomes less overconfident for low to mid confidence levels, while the other detectors become more overconfident.
  • Figure 2: We are able to calibrate CenterPoint (a) and PillarNet (b) effectively for both dominant predictions and the full confidence vector at the same time, using a combination of our training loss $\mathcal{L}_{\text{FullDECE}}$ and Isotonic Regression fitted on all predictions. Because simultaneous calibration of dominant predictions and the full confidence vector does not succeed in the case of DSVT-Pillar (c), we fit Isotonic Regression separately: on dominant predictions only to calibrate the dominant predictions, and on all predictions to calibrate the full confidence vector.
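
The choice of fitting Isotonic Regression on dominant predictions only or on all predictions, as described for Figure 2, can be sketched as follows. This is a minimal pure-Python pool-adjacent-violators fit that returns a step function. The helper names, the toy data, and the step-function lookup are assumptions for illustration and are not taken from the paper's code.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple

Model = Tuple[List[float], List[float]]  # (upper confidence edges, calibrated values)


def fit_isotonic(confidences: Sequence[float], hits: Sequence[int]) -> Model:
    """Pool-adjacent-violators fit of a non-decreasing confidence -> frequency map."""
    blocks: List[list] = []  # each block: [hit_sum, count, upper confidence edge]
    for conf, hit in sorted(zip(confidences, hits)):
        if blocks and blocks[-1][2] == conf:
            blocks[-1][0] += hit  # equal confidences always share one block
            blocks[-1][1] += 1
        else:
            blocks.append([float(hit), 1, conf])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            hit_sum, count, edge = blocks.pop()
            blocks[-1][0] += hit_sum
            blocks[-1][1] += count
            blocks[-1][2] = edge
    return [b[2] for b in blocks], [b[0] / b[1] for b in blocks]


def calibrate(model: Model, confidence: float) -> float:
    """Look up the calibrated value; inputs beyond the last edge use the last block."""
    edges, values = model
    return values[min(bisect_left(edges, confidence), len(values) - 1)]


def training_pairs(probs, labels, dominant_only: bool):
    """Build (confidence, hit) pairs from the top class only or from every class."""
    confs, hits = [], []
    for p, label in zip(probs, labels):
        top = max(range(len(p)), key=lambda k: p[k])
        for k in ([top] if dominant_only else range(len(p))):
            confs.append(p[k])
            hits.append(1 if k == label else 0)
    return confs, hits


# Toy held-out data (hypothetical): four detections, two classes.
probs = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.7, 0.3]]
labels = [0, 1, 0, 0]

dominant_model = fit_isotonic(*training_pairs(probs, labels, dominant_only=True))
full_model = fit_isotonic(*training_pairs(probs, labels, dominant_only=False))
print(calibrate(dominant_model, 0.7), calibrate(full_model, 0.3))
```

The model fitted on dominant predictions never sees confidences below the smallest top-class score, so it has no evidence about secondary predictions, while the model fitted on all predictions covers the whole confidence range. Library implementations such as scikit-learn's `IsotonicRegression` typically interpolate between fitted points rather than using a step lookup.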