Situation Monitor: Diversity-Driven Zero-Shot Out-of-Distribution Detection using Budding Ensemble Architecture for Object Detection

Qutub Syed; Michael Paulitsch; Korbinian Hagn; Neslihan Kose Cihangir; Kay-Ulrich Scholl; Fabian Oboril; Gereon Hinz; Alois Knoll

TL;DR

This work addresses zero-shot OOD detection for transformer-based object detectors in safety-critical settings. It introduces Situation Monitor, a zero-shot OOD module built on a Diversity-based Budding Ensemble Architecture (DBEA) integrated with DINO-DETR. The core idea is to train tandem detectors with a diversity-driven loss, which calibrates confidence and separates Far-OOD from Near-OOD inputs through an image-level uncertainty score, $\mathcal{U}_{SM}$. Results on KITTI, BDD100K, and COCO show improved OOD metrics and confidence calibration, with about 14% fewer trainable parameters than the vanilla transformer model. The approach targets reliable OOD handling in autonomous-driving-like perception tasks, and ablations identify effective settings for the hyperparameters $\lambda_{div}$, $\lambda_{ta}$, and $\lambda_{tq}$.
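
As a rough illustration of the idea, the sketch below scores an image by how much two tandem detector heads disagree and flags high-disagreement images as Far-OOD. It is not the paper's definition of $\mathcal{U}_{SM}$: the function names, the per-query confidence inputs, the mean-absolute-difference measure, and the threshold value are all assumptions made for illustration.

```python
from statistics import fmean


def image_uncertainty(conf_alpha, conf_beta):
    """Mean absolute disagreement between per-query confidences of two heads.

    Hypothetical stand-in for an image-level uncertainty score; the paper's
    actual U_SM formula is not reproduced here.
    """
    if len(conf_alpha) != len(conf_beta):
        raise ValueError("both heads must score the same set of queries")
    if not conf_alpha:
        return 0.0  # no queries, so no measurable disagreement
    return fmean(abs(a - b) for a, b in zip(conf_alpha, conf_beta))


def is_far_ood(uncertainty, threshold=0.3):
    """Flag an image as Far-OOD when disagreement is high (threshold is made up)."""
    return uncertainty >= threshold


# Heads that agree give zero uncertainty; heads that disagree give a high score.
agreeing = image_uncertainty([0.9, 0.8, 0.7], [0.9, 0.8, 0.7])
disagreeing = image_uncertainty([0.9, 0.8, 0.7], [0.4, 0.3, 0.2])
```

In this toy setup a single threshold decides the label; in practice it would have to be chosen on held-out data so that Near-OOD images are not flagged.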

Abstract

We introduce Situation Monitor, a novel zero-shot Out-of-Distribution (OOD) detection approach for transformer-based object detection models that enhances reliability in safety-critical machine learning applications such as autonomous driving. Situation Monitor builds on the Diversity-based Budding Ensemble Architecture (DBEA), which adds a diversity loss to the training of the budding ensemble architecture. This improves OOD performance by detecting Far-OOD samples while minimizing false positives on Near-OOD samples. DBEA also improves the calibration of confidence scores, particularly with respect to the intersection over union (IoU) of the detected objects. The DBEA model achieves these gains with 14% fewer trainable parameters than the vanilla model, improving efficiency without compromising OOD detection or confidence calibration.
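
To make "calibration with respect to IoU" concrete, the sketch below computes a binned gap between predicted confidence and achieved IoU, in the spirit of an expected calibration error. The metric, the function name `confidence_iou_gap`, and the bin count are assumptions for illustration; the paper's own evaluation metrics are not reproduced here.

```python
def confidence_iou_gap(confidences, ious, n_bins=5):
    """Weighted mean |avg confidence - avg IoU| over equal-width confidence bins.

    Assumes confidences and IoUs lie in [0, 1]. A value near 0 means the
    confidence of a detection tracks how well its box overlaps the ground truth.
    """
    if len(confidences) != len(ious):
        raise ValueError("need one IoU per confidence score")
    bins = [[] for _ in range(n_bins)]
    for conf, iou in zip(confidences, ious):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, iou))
    total = len(confidences)
    gap = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        avg_iou = sum(i for _, i in members) / len(members)
        gap += len(members) / total * abs(avg_conf - avg_iou)
    return gap


# Over-confident detections (confidence 0.9, IoU 0.5) yield a gap of about 0.4.
overconfident_gap = confidence_iou_gap([0.9, 0.9], [0.5, 0.5])
```

A lower gap indicates that a detector's confidence is a better proxy for localization quality, which is the property the abstract describes.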

Paper Structure (13 sections, 6 equations, 10 figures, 1 table)

Introduction
Related Work
Problem Statement
Our Approach
DBEA: Diversity based Budding Ensemble Architecture
Situation Monitor
Evaluation Metrics for Situation Monitor
Experiments
Experiment Setup
Ablation Study on $\lambda_{div}$, $\lambda_{ta}$ and $\lambda_{tq}$ on KITTI trained DBEA-DINO-DETR
Benchmark Results
Overhead Analysis
Conclusion

Figures (10)

Figure 1: Out-of-Distribution definition. The dotted line represents the decision boundary of an OOD detection model that generalizes effectively.
Figure 2: The primary aim of the Situation Monitor is to distinguish between known and unknown situations. For instance, for a model trained on an automotive dataset such as KITTI, other automotive scenarios are categorized as Near-Out-of-Distribution (Near-OOD) situations, whereas an indoor scenario would be labelled as a Far-Out-of-Distribution (Far-OOD) situation.
Figure 3: Adaptation of BEA [qutub2022bea] to DINO-DETR [zhang2022dino].
Figure 4: Within the BEA model, tandem detectors undergo further refinement with diversity loss, aiming to enhance the distinction between $\alpha$ and $\beta$ detectors. This leads to the development of Diversity-based BEA.
Figure 5: Architecture diagram adapted from DINO-DETR [zhang2022dino] for DBEA-DINO-DETR, illustrating the replication of the final layers. For a comprehensive understanding of DINO-DETR, please refer to Figure 2 in [zhang2022dino]. BEA duplicates the final layers as $\alpha$ and $\beta$ to create tandem detectors, which are subsequently used in the situation monitoring model.
...and 5 more figures
