Uncertainty estimates for semantic segmentation: providing enhanced reliability for automated motor claims handling

Jan Küchler, Daniel Kröll, Sebastian Schoenen, Andreas Witte

TL;DR

This work addresses the reliability of semantic segmentation for car-body-part damage assessment in automated motor claims by introducing a post-hoc meta-classification approach. It derives pixel-level uncertainty measures from softmax outputs and layer gradients, aggregates them to segment-level features, and trains a segment-quality classifier to distinguish high- vs low-quality segments without retraining the segmentation model. The best-performing model achieves an AUROC of $0.916 \pm 0.002$ and shows that segment quality correlates with precision ($\rho=0.74$) and IoU metrics ($\rho\ge 0.90$); by removing high-uncertainty segments, the mean IoU improves by $\Delta mIoU \approx 0.16$ on average, reducing false positives in downstream tasks. The proposed uncertainty-map-based mask-correction offers practical reliability gains for automated damage assessment in motor claims handling, with potential deployment as a lightweight post-processing step.
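
The softmax-based part of this pipeline can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: the function names `pixel_uncertainties` and `segment_features` are hypothetical, the definitions of the probability margin $D_i$ and the normalised entropy $E_i$ follow common practice and are assumed rather than taken from the paper, and the mean is assumed as the aggregation function. The gradient-based measure is left out because it requires access to the segmentation model itself.

```python
import math

def pixel_uncertainties(probs):
    """Return (1 - p_max, 1 - D, E) for one pixel's softmax vector.

    Assumes at least two classes and probabilities that sum to 1.
    """
    ranked = sorted(probs, reverse=True)
    p_max = ranked[0]
    # D: gap between the two most likely classes (assumed definition).
    margin = ranked[0] - ranked[1]
    # Shannon entropy, normalised by log(number of classes) to lie in [0, 1].
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    entropy /= math.log(len(probs))
    return 1.0 - p_max, 1.0 - margin, entropy

def segment_features(pixel_probs):
    """Average each pixel-wise measure over the pixels of one segment."""
    per_pixel = [pixel_uncertainties(p) for p in pixel_probs]
    n = len(per_pixel)
    return tuple(sum(values) / n for values in zip(*per_pixel))

if __name__ == "__main__":
    # A two-pixel segment: one confident pixel, one maximally uncertain pixel.
    segment = [[1.0, 0.0, 0.0], [1 / 3, 1 / 3, 1 / 3]]
    print(segment_features(segment))
```

A confident pixel yields values near zero for all three measures, while a uniform softmax vector yields the maximum margin and entropy uncertainty of 1, so the segment-level averages summarise how much of a segment is covered by uncertain pixels.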

Abstract

Deep neural network models for image segmentation can be a powerful tool for the automation of motor claims handling processes in the insurance industry. A crucial aspect is the reliability of the model outputs when facing adverse conditions, such as low-quality photos taken by claimants to document damage. We explore the use of a meta-classification model to empirically assess the precision of segments predicted by a model trained for the semantic segmentation of car body parts. Different sets of features correlated with the quality of a segment are compared, and an AUROC score of 0.915 is achieved for distinguishing between high- and low-quality segments. By removing low-quality segments, the average mIoU of the segmentation output is improved by 16 percentage points and the number of wrongly predicted segments is reduced by 77%.
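
For readers unfamiliar with the AUROC figure quoted above, the following sketch shows what the score measures: the probability that a randomly chosen high-quality segment receives a higher meta-classifier score than a randomly chosen low-quality one, with ties counted as one half. The function name `auroc` and the toy scores are illustrative assumptions, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: meta-classifier outputs, higher meaning "more likely high quality".
    labels: 1 for high-quality segments, 0 for low-quality segments.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one segment of each class")
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

if __name__ == "__main__":
    # Toy data (hypothetical): four segments, two of each class.
    print(auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two segment classes.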

Paper Structure (4 sections, 3 equations, 10 figures, 3 tables, 1 algorithm)

Figures (10)

  • Figure 1: Photograph of a car (left), taken to highlight issues that can negatively affect the performance of DNN segmentation models: reflections, dirt and bad exposure. Result of our semantic segmentation model (right), trained to segment car body parts. Predicted segments are shown as colored overlays. A few mistakes in the prediction are highlighted by red boxes: a reflection on the door is segmented as a molding part, and a part of the rear left rim is identified as an air intake.
  • Figure 2: Schematic diagram of the explored method. An input image is processed by a semantic segmentation model, and the resulting segmentation mask and softmax probabilities are aggregated to segment-wise features. These are processed by a meta-classification model to produce a segment-wise uncertainty map. Finally, the segment uncertainties are used to correct the segmentation mask.
  • Figure 3: Qualitative heat-maps of $1-\hat{p}_i$ (top left), $1 - D_i$ (top right), the entropy $E_i$ (bottom left) and the gradient uncertainty $G_i$ (bottom right), for the example image shown in Figure 1. Darker shades indicate higher pixel-wise uncertainties.
  • Figure 4: Sketch of a segmentation result and the quality metrics for one of the segments. (a) A ground truth segment of class A (black dashed rectangle) is covered by three predicted segments: two of class A (blue), divided by a segment of a different class B (red). The correctly segmented area is indicated by the two blue shaded rectangles. (b) The $\mathit{IoU}$ of the left-most predicted segment is small, as it is calculated by dividing the blue shaded area by the union of the ground truth and the predicted segment. In contrast, for the $\mathit{IoU}_{\mathrm{adj.}}$ the area covered by the other segment of class A is disregarded. For the precision, $p$, the correctly predicted area is compared only to the full predicted segment.
  • Figure 5: Distribution of the segment-wise precision. Segments with $p>0.5$ are selected as correct predictions. The population of segments at very low precision consists mostly of small, wrongly predicted segments.
  • ...and 5 more figures
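
The segment-wise metrics described in the captions of Figures 4 and 5 can be illustrated with pixel sets. This is a minimal sketch based only on those captions: the function names `segment_metrics` and `is_correct` are hypothetical, segments are represented as sets of `(row, column)` pixels for simplicity, and the adjusted IoU is assumed to remove from the ground truth those pixels that are covered by other predicted segments of the same class.

```python
def segment_metrics(pred, gt, other_same_class=frozenset()):
    """Return (precision, IoU, adjusted IoU) for one predicted segment.

    pred: non-empty set of pixels in the predicted segment.
    gt: set of ground truth pixels of the same class.
    other_same_class: ground truth pixels covered by other predicted
        segments of the same class (ignored by the adjusted IoU).
    """
    inter = len(pred & gt)
    precision = inter / len(pred)
    iou = inter / len(pred | gt)
    iou_adj = inter / len(pred | (gt - other_same_class))
    return precision, iou, iou_adj

def is_correct(precision, threshold=0.5):
    """Label a segment as a correct prediction if its precision exceeds 0.5."""
    return precision > threshold

if __name__ == "__main__":
    # Layout mirroring the Figure 4 sketch on a 1 x 6 strip of pixels:
    # class A (cols 0-1), class B (cols 2-3), class A again (cols 4-5).
    gt = {(0, c) for c in range(6)}
    left = {(0, 0), (0, 1)}
    right = {(0, 4), (0, 5)}
    print(segment_metrics(left, gt, other_same_class=right))
```

In this layout the left-most segment has a precision of 1 but an IoU of only 1/3, while the adjusted IoU rises to 1/2 because the pixels claimed by the other class A segment no longer count against it.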