Pixel-wise Gradient Uncertainty for Convolutional Neural Networks applied to Out-of-Distribution Segmentation

Kira Maag, Tobias Riedlinger

TL;DR

This work tackles the problem of detecting and segmenting out-of-distribution objects in semantic segmentation by introducing Pixel-wise Gradient Uncertainty (PGN). The core idea is to compute pixel-level uncertainty scores from the magnitude of loss gradients with respect to the final convolutional layer, using auxiliary labels that can be the predicted one-hot vector or a uniform distribution, defined as $\|\nabla_K \mathcal{L}_{ab}\|_p$ for each pixel $(a,b)$. The method is efficient, requiring only a single backpropagation step per inference to obtain $\text{PGN}_{\text{oh}}$ and $\text{PGN}_{\text{uni}}$, with exact gradient expressions derived for both the last layer and, optionally, deeper layers via patch unfolding and $L^p$-norm factorization. Extensive experiments on Cityscapes and four OoD benchmarks (LostAndFound, Fishyscapes, RoadAnomaly21, RoadObstacle21) demonstrate competitive pixel- and segment-level uncertainty metrics (ECE, AuSE, AuROC, $R^2$) and strong OoD segmentation performance, often outperforming MC Dropout and other baselines while incurring only ~1% extra runtime. The work also provides extensive ablations over $p$-norm values and deeper-layer gradients, highlighting the practical usefulness and robustness of gradient-based uncertainty for real-time, open-world segmentation tasks.
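The score described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. It assumes the final layer is a 1x1 convolution with kernel $K$ and that the pixel loss is cross-entropy. Under those assumptions the gradient at pixel $(a,b)$ is the outer product of (softmax minus auxiliary label) and the input feature vector, and its $p$-norm factorizes into the product of the two vector $p$-norms. The function names `pgn_scores` and `lp_norm` and the array layout are hypothetical.

```python
import numpy as np

def softmax(z, axis=0):
    # Numerically stable softmax along the class axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def lp_norm(x, p, axis):
    # Element-wise L^p (quasi-)norm along one axis; valid for any p > 0.
    return (np.abs(x) ** p).sum(axis=axis) ** (1.0 / p)

def pgn_scores(features, kernel, bias, p=2.0):
    """Pixel-wise gradient norms for an assumed 1x1 final convolution.

    features: (C_in, H, W) input to the last layer
    kernel:   (C_out, C_in) weights K of the 1x1 convolution
    bias:     (C_out,)
    Returns two (H, W) heatmaps: one-hot and uniform auxiliary labels.
    """
    logits = np.einsum("oc,chw->ohw", kernel, features) + bias[:, None, None]
    probs = softmax(logits, axis=0)
    n_classes = probs.shape[0]

    # Auxiliary labels: predicted class as one-hot, or a uniform distribution.
    predicted = probs.argmax(axis=0)
    one_hot = (np.arange(n_classes)[:, None, None] == predicted[None]).astype(probs.dtype)
    uniform = np.full_like(probs, 1.0 / n_classes)

    # Gradient w.r.t. K at a pixel is outer(probs - label, features), so
    # ||grad||_p = ||probs - label||_p * ||features||_p.
    feat_norm = lp_norm(features, p, axis=0)
    pgn_oh = lp_norm(probs - one_hot, p, axis=0) * feat_norm
    pgn_uni = lp_norm(probs - uniform, p, axis=0) * feat_norm
    return pgn_oh, pgn_uni
```

Calling `pgn_scores(features, kernel, bias, p=0.5)` on the features entering the last layer returns two heatmaps of shape `(H, W)`. In this sketch the factorization means no per-pixel backward pass is needed. Larger kernels would require the patch unfolding mentioned above, which is not shown here.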

Abstract

In recent years, deep neural networks have defined the state-of-the-art in semantic segmentation where their predictions are constrained to a predefined set of semantic classes. They are to be deployed in applications such as automated driving, although their categorically confined expressive power runs contrary to such open world scenarios. Thus, the detection and segmentation of objects from outside their predefined semantic space, i.e., out-of-distribution (OoD) objects, is of highest interest. Since uncertainty estimation methods like softmax entropy or Bayesian models are sensitive to erroneous predictions, these methods are a natural baseline for OoD detection. Here, we present a method for obtaining uncertainty scores from pixel-wise loss gradients which can be computed efficiently during inference. Our approach is simple to implement for a large class of models, does not require any additional training or auxiliary data and can be readily used on pre-trained segmentation models. Our experiments show the ability of our method to identify wrong pixel classifications and to estimate prediction quality at negligible computational overhead. In particular, we observe superior performance in terms of OoD segmentation to comparable baselines on the SegmentMeIfYouCan benchmark, clearly outperforming other methods.

Paper Structure

This paper contains 31 sections, 23 equations, 6 figures, and 11 tables.

Figures (6)

  • Figure 1: Top: Semantic segmentation by a deep neural network. Bottom: Gradient uncertainty heatmap obtained by our method.
  • Figure 2: Schematic illustration of the computation of pixel-wise gradient norms for a semantic segmentation network with a final convolution layer. Auxiliary labels may be derived from the softmax prediction or supplied in any other way (e.g., as a uniform all-warm label). We circumvent direct backpropagation per pixel by utilizing the paper's equations for explicit gradient computation and uniform gradients.
  • Figure 3: Segment-wise uncertainty evaluation results for both backbone architectures and the Cityscapes dataset in terms of classification $\text{AuROC}$ and regression $R^2$ values. From left to right: ensemble, MC Dropout, maximum softmax, mean entropy, MetaSeg (MS) approach, gradient features obtained by predictive one-hot and uniform labels (PGN), MetaSeg in combination with PGN.
  • Figure 4: Semantic segmentation prediction and $\mathrm{PGN}_{\mathit{uni}}^{p=0.5}$ heatmap for the Cityscapes dataset (left) and the RoadAnomaly21 dataset (right) for the WideResNet backbone.
  • Figure 5: $\text{AuROC}$ and $R^2$ values for both backbone architectures applied to the Cityscapes dataset and the different $p$-norms.
  • ...and 1 more figure