Revisiting Gradient-based Uncertainty for Monocular Depth Estimation

Julia Hornauer; Amir El-Ghoussani; Vasileios Belagiannis

TL;DR

This paper addresses per-pixel uncertainty in monocular depth estimation with a post hoc, training-free, gradient-based approach. A reference depth $d_{ref}$ is generated from augmented inputs via an invertible transform, and an auxiliary loss $\mathcal{L}_{aux}$ measures the consistency between the predicted depth and $d_{ref}$. Backpropagating this loss through the fixed depth estimator yields gradient maps $g_i$ with respect to decoder features. These gradients are processed into pixel-wise uncertainty maps, taken either from a single decoder layer or across multiple layers, using a normalised, layer-robust scoring scheme. The method achieves state-of-the-art uncertainty estimation on the KITTI and NYU benchmarks for both convolutional and transformer-based models without retraining. Ablations validate the design choices and show robustness across architectures and augmentations, and the code is publicly available.

Abstract

Monocular depth estimation, similar to other image-based tasks, is prone to erroneous predictions due to ambiguities in the image, for example, caused by dynamic objects or shadows. For this reason, pixel-wise uncertainty assessment is required for safety-critical applications to highlight the areas where the prediction is unreliable. We address this in a post hoc manner and introduce gradient-based uncertainty estimation for already trained depth estimation models. To extract gradients without depending on the ground truth depth, we introduce an auxiliary loss function based on the consistency of the predicted depth and a reference depth. The reference depth, which acts as pseudo ground truth, is in fact generated using a simple image or feature augmentation, making our approach simple and effective. To obtain the final uncertainty score, the derivatives w.r.t. the feature maps from single or multiple layers are calculated using back-propagation. We demonstrate that our gradient-based approach is effective in determining the uncertainty without re-training using the two standard depth estimation benchmarks KITTI and NYU. In particular, for models trained with monocular sequences and therefore most prone to uncertainty, our method outperforms related approaches. In addition, we publicly provide our code and models: https://github.com/jhornauer/GrUMoDepth
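The pipeline described above (reference depth from an augmented input, auxiliary consistency loss, gradients with respect to feature maps) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The toy "network" (`toy_features`, `toy_depth`), the horizontal flip as the augmentation, the squared-error form of the auxiliary loss, the channel-wise maximum, and the min-max normalisation are all assumptions chosen so that the gradient can be written analytically; a real depth estimator would rely on automatic differentiation instead.

```python
import numpy as np


def toy_features(image, kernels):
    """Hypothetical stand-in for decoder features: one channel per 1x3 kernel."""
    padded = np.pad(image, ((0, 0), (1, 1)), mode="edge")
    # Left, centre and right neighbour of every pixel, shape (3, H, W).
    shifts = np.stack([padded[:, :-2], padded[:, 1:-1], padded[:, 2:]])
    # Weighted sum of the neighbours for each kernel, shape (C, H, W).
    return np.tensordot(kernels, shifts, axes=([1], [0]))


def toy_depth(features, head):
    """Hypothetical 1x1 output layer: per-pixel weighted sum over channels."""
    return np.tensordot(head, features, axes=([0], [0]))


def gradient_uncertainty(image, kernels, head):
    """Pixel-wise uncertainty from gradients of an auxiliary loss (sketch)."""
    depth = toy_depth(toy_features(image, kernels), head)

    # Reference depth: predict on the horizontally flipped image, then flip
    # the prediction back (the flip is its own inverse). Assumed augmentation.
    flipped = image[:, ::-1]
    d_ref = toy_depth(toy_features(flipped, kernels), head)[:, ::-1]

    # Assumed auxiliary loss: L_aux = mean((depth - d_ref)^2), with d_ref
    # treated as a constant. For this linear toy model the gradient with
    # respect to feature channel c is (2 / N) * (depth - d_ref) * head[c].
    residual = depth - d_ref
    grads = (2.0 / residual.size) * head[:, None, None] * residual[None]

    # Collapse channels with a maximum of absolute gradients (assumption),
    # then min-max normalise the map to [0, 1].
    score = np.abs(grads).max(axis=0)
    span = score.max() - score.min()
    return (score - score.min()) / span if span > 0 else np.zeros_like(score)
```

Calling `gradient_uncertainty(img, kernels, head)` with a 2D image array, a `(C, 3)` kernel array, and a length-`C` head vector returns a map of the same height and width as the image. Pixels where the prediction and the flipped-back reference disagree most receive the highest scores.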
