
A Flow-based Credibility Metric for Safety-critical Pedestrian Detection

Maria Lyssenko, Christoph Gladisch, Christian Heinzemann, Matthias Woehrle, Rudolph Triebel

TL;DR

The paper addresses a safety gap in automated driving (AD) perception: standard evaluation relies on safety-agnostic metrics. It introduces c-flow, a flow-based credibility metric for pedestrian bounding boxes that leverages optical flow from image sequences. c-flow combines motion cues with bounding box dynamics over a sliding window to produce a score that approaches 1 for true positives (TP) and 0 for false negatives (FN); an unsupervised extension uses hypothesized bounding boxes. The approach is evaluated on a large AD dataset using RetinaNet pretrained on nuImages and optical flow from RAFT. It shows strong discrimination between TP and FN and, in the unsupervised setting, strong correlation to ground truth (rho around 0.83) with approximately 97% FN agreement. The work provides a practical runtime observer for safety-critical misdetections and offers dataset insights for safety auditing and potential active learning. Future work includes refining the hypothesized bounding boxes and integrating the metric into runtime pipelines.

Abstract

Safety is of utmost importance for perception in automated driving (AD). However, a prime safety concern in state-of-the-art object detection is that standard evaluation schemes utilize safety-agnostic metrics to argue sufficient detection performance. Hence, it is imperative to leverage supplementary domain knowledge to accentuate safety-critical misdetections during evaluation tasks. To tackle this underspecification, this paper introduces a novel credibility metric, called c-flow, for pedestrian bounding boxes. To this end, c-flow relies on a complementary optical flow signal from image sequences and enhances the analysis of safety-critical misdetections without requiring additional labels. We implement and evaluate c-flow with a state-of-the-art pedestrian detector on a large AD dataset. Our analysis demonstrates that c-flow allows developers to identify safety-critical misdetections.
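
As a rough illustration of what reading an optical flow signal at a pedestrian bounding box can look like, the sketch below computes the median flow magnitude inside a box. It is a minimal example under stated assumptions, not the paper's implementation: the function name `median_flow_in_box`, the `(H, W, 2)` flow layout, the `(x1, y1, x2, y2)` box format, and the use of the flow magnitude are all hypothetical choices made for illustration.

```python
import numpy as np


def median_flow_in_box(flow, box):
    """Median optical-flow magnitude inside a bounding box.

    flow: array of shape (H, W, 2) with per-pixel (dx, dy) displacements
          (assumed layout; a flow estimator's output may need transposing).
    box:  (x1, y1, x2, y2) in pixel coordinates, with x2/y2 exclusive
          (assumed format).
    """
    h, w = flow.shape[:2]
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    # Clip the box to the image so partially visible boxes still work.
    x1, x2 = max(0, x1), min(w, x2)
    y1, y2 = max(0, y1), min(h, y2)
    if x2 <= x1 or y2 <= y1:
        return float("nan")  # box lies entirely outside the image
    magnitude = np.linalg.norm(flow[y1:y2, x1:x2], axis=-1)
    return float(np.median(magnitude))


# Toy flow map: a 4x4 patch moving by (3, 4) pixels on a static background.
flow = np.zeros((10, 10, 2))
flow[2:6, 3:7] = (3.0, 4.0)
u = median_flow_in_box(flow, (3, 2, 7, 6))
```

The median is used here because it is less sensitive than the mean to background pixels that fall inside the box; whether that matches the paper's choice beyond the "median score" named in its figure captions is not confirmed by this page.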

Paper Structure (14 sections, 1 equation, 9 figures)


Figures (9)

  • Figure 1: We present a novel metric, called c-flow, to quantify the credibility of pedestrian bounding boxes. To this end, we exploit temporal information from optical flow (bottom) and a series of pedestrian detections ($\mathfrak{B}_{\textit{pred,t}}$) to rate whether it is credible to have a misdetection at $t=0$, i.e., a false negative in our case. Thus, c-flow provides a supplementary signal that helps to uncover prospectively challenging, safety-critical misdetections (c-flow $\rightarrow 0$). In the optical flow maps, the intense red color illustrates particularly high relative motion, which we leverage for the metric design.
  • Figure 2: Selected pedestrian track (red star sequence from Fig. \ref{fig:crit}) showing the evolution over time (in color) for a pedestrian bounding box $\mathfrak{B}_\textit{GT,t}$, illustrating (i) the classification outcome (top) and (ii) the median score $u$ gathered from the optical flow map at $\mathfrak{B}_\textit{GT,t}$. The gray window highlights sudden fluctuations in $u$ that correlate with the switch between classification outcomes at, e.g., TTC $\approx 2.25\,s$.
  • Figure 3: (a) Linear regression is used to determine the variability in $u$, from which c-flow is constructed. (b) For missing detections at $t_0$, we apply the methodology of hypothesized bounding boxes to infer the required $\mathfrak{B}_{hyp}$: we extract the upper-left corner (UL) of past predictions and extrapolate the hypothesized UL position at $t_0$.
  • Figure 4: Experimental setup for the c-flow evaluation. We extend our reachability framework (RF) from [LGH+22] with a DNN-based optical flow estimation using RAFT. We perform our evaluation on identified, potentially safety-critical pedestrian tracks (Track ID) with respect to the TTC derived from the motion domain.
  • Figure 5: Extracted pedestrian tracks from Argoverse 1.1 using the reachability framework. Each data point represents an interaction defined by the most critical pedestrian misdetection w.r.t. TTC and distance. The dotted line separates critical (TTC$<2\,s$) from non-critical interactions. Misdetections with poor detection quality (0$<$IoU$<$0.5) are highlighted in orange, and FN (IoU=0) are marked in red.
  • ...and 4 more figures
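
The captions of Figures 3 and 5 mention three building blocks that can be sketched in a few lines: a linear fit over a window of $u$ values, a linear extrapolation of the upper-left corner to $t_0$, and IoU-based bucketing of detections. The code below is a hedged sketch only. It does not reproduce the paper's c-flow equation, which is not shown on this page. The function names, the `(x_ul, y_ul, width, height)` box format, the use of the residual standard deviation as a variability measure, the reuse of the last predicted box size for the hypothesized box, and the `"TP"` label for IoU of at least 0.5 are all assumptions made for illustration.

```python
import numpy as np


def residual_spread(times, u_values):
    """Std. deviation of u around a least-squares line over one window.

    Assumption: this residual spread stands in for "variability in u";
    the paper's actual definition may differ.
    """
    t = np.asarray(times, dtype=float)
    u = np.asarray(u_values, dtype=float)
    slope, intercept = np.polyfit(t, u, deg=1)
    return float(np.std(u - (slope * t + intercept)))


def hypothesize_box(times, boxes, t0):
    """Linearly extrapolate the upper-left corner of past boxes to t0.

    boxes: past predictions as (x_ul, y_ul, width, height).
    Assumption: the hypothesized box reuses the most recent width/height.
    """
    t = np.asarray(times, dtype=float)
    b = np.asarray(boxes, dtype=float)
    x_ul = np.polyval(np.polyfit(t, b[:, 0], deg=1), t0)
    y_ul = np.polyval(np.polyfit(t, b[:, 1], deg=1), t0)
    return (float(x_ul), float(y_ul), float(b[-1, 2]), float(b[-1, 3]))


def iou(a, b):
    """Intersection over union of two (x_ul, y_ul, width, height) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def detection_bucket(iou_value):
    """Buckets from the Figure 5 caption; the 'TP' label is an assumption."""
    if iou_value == 0:
        return "FN"
    if iou_value < 0.5:
        return "poor"
    return "TP"


# Three past predictions moving right and down; the detection at t0 = 0 is missing.
past_times = [-3, -2, -1]
past_boxes = [(10, 20, 4, 8), (12, 21, 4, 8), (14, 22, 4, 8)]
box_hyp = hypothesize_box(past_times, past_boxes, t0=0)

spread_smooth = residual_spread([0, 1, 2, 3], [1.0, 2.0, 3.0, 4.0])
spread_jumpy = residual_spread([0, 1, 2, 3], [0.0, 1.0, 0.0, 1.0])
bucket = detection_bucket(iou((0, 0, 2, 2), (1, 1, 2, 2)))
```

In this toy track the upper-left corner advances by 2 pixels in x and 1 pixel in y per step, so the extrapolated box continues that motion. A perfectly linear $u$ sequence yields a residual spread near zero, while a fluctuating one yields a larger value, which is the kind of contrast the gray window in Figure 2 points to.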