Introducing a Class-Aware Metric for Monocular Depth Estimation: An Automotive Perspective

Tim Bader, Leon Eisemann, Adrian Pogorzelski, Namrata Jangid, Attila-Balazs Kis

TL;DR

This work addresses the need for interpretable, safety-oriented evaluation of monocular depth estimation in automotive settings by introducing a novel Class-Aware Multi-Component Depth Estimation metric. The metric combines a class-based component with intra- and inter-class weighting, a local feature component focused on edges and corners, and a global consistency component, aggregating to $L = \gamma(E_{\text{class}} + E_{\text{feature}} + E_{\text{global}})$ with $\gamma = 1$, where $E_{\text{class}}$, $E_{\text{feature}}$, and $E_{\text{global}}$ are defined via MAE-based formulations. Through evaluation on the GOOSE dataset and a wide set of SOTA models, the approach yields per-class insights (e.g., safety-critical traffic signs) and identifies challenging scenarios beyond what classical metrics reveal, enabling more informed model selection for safety-critical deployments. The work demonstrates the practical utility of its metric for safety assessments and proposes concrete avenues for future enhancements, including improved labeling, distribution awareness, and cross-dataset applicability.

Abstract

The steadily improving accuracy reported for metric monocular depth estimation models has led to growing interest from the automotive domain. Current model evaluations, however, offer little insight into model performance, particularly with respect to safety-critical or unseen classes. In this paper, we present a novel approach to evaluating depth estimation models. Our proposed metric combines three components: a class-wise component, an image-feature component covering edges and corners, and a component that preserves global consistency. Classes are further weighted by their distance in the scene and by their criticality for automotive applications. In the evaluation, we demonstrate the benefits of our metric through a comparison with classical metrics, class-wise analytics, and the retrieval of critical situations. The results show that our metric provides deeper insights into model results while fulfilling safety-critical requirements. We release the code and weights in the following repository: https://github.com/leisemann/ca_mmde
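
To make the aggregation $L = \gamma(E_{\text{class}} + E_{\text{feature}} + E_{\text{global}})$ concrete, the following is a minimal sketch and not the authors' implementation. Only the sum of three MAE-based components and $\gamma = 1$ come from the text above. The helper names (`masked_mae`, `combined_metric`), the per-class weights, the precomputed edge/corner mask, and the toy depth values are all assumptions made for illustration; the paper's actual intra- and inter-class weighting and feature extraction are not reproduced here.

```python
from typing import Dict, Sequence


def masked_mae(pred: Sequence[float], gt: Sequence[float], mask: Sequence[bool]) -> float:
    """Mean absolute error over the pixels selected by mask (0.0 if none are selected)."""
    errors = [abs(p - g) for p, g, m in zip(pred, gt, mask) if m]
    return sum(errors) / len(errors) if errors else 0.0


def combined_metric(
    pred: Sequence[float],
    gt: Sequence[float],
    class_masks: Dict[str, Sequence[bool]],
    class_weights: Dict[str, float],
    feature_mask: Sequence[bool],
    gamma: float = 1.0,
) -> float:
    """Hypothetical aggregation L = gamma * (E_class + E_feature + E_global)."""
    # E_class: per-class MAE scaled by an assumed criticality/distance weight.
    e_class = sum(
        class_weights[name] * masked_mae(pred, gt, mask)
        for name, mask in class_masks.items()
    )
    # E_feature: MAE restricted to pixels flagged as edges or corners (mask assumed given).
    e_feature = masked_mae(pred, gt, feature_mask)
    # E_global: plain MAE over every pixel.
    e_global = masked_mae(pred, gt, [True] * len(gt))
    return gamma * (e_class + e_feature + e_global)


# Toy example: four pixels of a flattened depth map, depths in metres.
pred = [10.0, 12.0, 30.0, 41.0]
gt = [10.0, 10.0, 33.0, 40.0]
class_masks = {
    "road": [True, True, False, False],
    "traffic_sign": [False, False, True, True],
}
class_weights = {"road": 0.5, "traffic_sign": 2.0}  # assumed values
feature_mask = [False, True, True, False]

score = combined_metric(pred, gt, class_masks, class_weights, feature_mask)
print(score)  # 8.5
```

In this toy case the plain MAE is 1.5, while the combined score is 8.5, because the error on the heavily weighted traffic-sign pixels and on the feature pixels is counted again in its own component. This mirrors, in a simplified way, how the metric can separate models that a single global average would rank as similar.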

Paper Structure

This paper contains 25 sections, 9 equations, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: Class hierarchy to categorize datasets, consisting of high- and mid-level classes.
  • Figure 2: Distribution of the pre-defined classes that are present in the datasets used by the evaluated models. The percentages were calculated with the class-weighting equation (label eq:class_weighting).
  • Figure 3: Example use of our metric for identifying challenging scenes in depth estimation. A classical MAE evaluation yields 3.77 for Metric3D V2 and 6.00 for DepthAnything V2, which misses factors that matter in safety-critical use. In comparison, our metric yields 29.97 for Metric3D V2 and 28.98 for DepthAnything V2. Our metric factors in missed objects (a), object distinction (b), and shape representation (a) and (c).