Cumulative Consensus Score: Label-Free and Model-Agnostic Evaluation of Object Detectors in Deployment

Avinaash Manoharan, Xiangyu Yin, Domenik Helm, Chih-Hong Cheng

TL;DR

We introduce the Cumulative Consensus Score (CCS), a label-free monitoring signal for continuously evaluating and comparing object detectors in real-world settings. CCS provides a robust foundation for DevOps-style monitoring of object detectors.

Abstract

Evaluating object detection models in deployment is challenging because ground-truth annotations are rarely available. We introduce the Cumulative Consensus Score (CCS), a label-free monitoring signal for continuous evaluation and comparison of detectors in real-world settings. CCS applies test-time data augmentation to each image and measures the spatial consistency of predicted bounding boxes across augmented views using Intersection over Union. The resulting consensus score serves as a proxy for reliability without requiring bounding box annotations. In controlled experiments on Open Images and KITTI, CCS achieved over 90% congruence with F1-score, Probabilistic Detection Quality, and Optimal Correction Cost, with qualitative consistency further confirmed on COCO and BDD100K across model pairs. The method is model-agnostic, working across single-stage and two-stage detectors, and operates at the case level to highlight under-performing scenarios. We also provide a simplified theoretical link between expected CCS and detection correctness. Altogether, CCS provides a robust foundation for DevOps-style monitoring of object detectors.
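
The consensus idea described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's exact CCS definition, which is not reproduced in this summary. The function names (`iou`, `pair_consensus`, `consensus_score`), the symmetric best-match rule, and the handling of empty predictions are all assumptions made for illustration. The sketch also assumes that boxes from each augmented view have already been mapped back to the original image coordinates.

```python
from itertools import combinations

def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def pair_consensus(boxes_a, boxes_b):
    """Symmetric best-match IoU between the boxes of two views.

    Assumption: two empty views agree perfectly (1.0); one empty and
    one non-empty view disagree completely (0.0).
    """
    if not boxes_a and not boxes_b:
        return 1.0
    if not boxes_a or not boxes_b:
        return 0.0
    a_to_b = [max(iou(a, b) for b in boxes_b) for a in boxes_a]
    b_to_a = [max(iou(a, b) for a in boxes_a) for b in boxes_b]
    return (sum(a_to_b) + sum(b_to_a)) / (len(a_to_b) + len(b_to_a))

def consensus_score(views):
    """Mean pairwise consensus over all augmented views of one image.

    `views` is a list with one entry per augmentation; each entry is a
    list of predicted boxes in original-image coordinates.
    """
    pairs = list(combinations(views, 2))
    if not pairs:
        raise ValueError("need at least two augmented views")
    return sum(pair_consensus(a, b) for a, b in pairs) / len(pairs)

# Usage: three views of one image, each predicting a single box.
views = [
    [(0, 0, 2, 2)],
    [(0, 0, 2, 2)],
    [(1, 0, 3, 2)],
]
score = consensus_score(views)
```

Under these assumptions, a score near 1 means the detector predicts nearly the same boxes regardless of augmentation, while a low score flags an image where the predictions are unstable and may merit inspection.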

Paper Structure

This paper contains 31 sections, 2 theorems, 25 equations, 4 figures, 7 tables.

Key Result

Lemma 1

Consider two object detectors $f_1$ and $f_2$ with augmentation-wise correctness $p_1$ and $p_2$, respectively, under the idealized single-object setting in Sec. sec:theory_correctness_ccs. Let $CCS_k$ be defined as in Eq. eq:ccs_given_i_incorrect--Eq. eq:expected_ccs_sum_form. For any margin $\Omeg if and only if

Figures (4)

  • Figure 1: Workflow of comparing the result of two object detectors $f_1$, $f_2$ using the corresponding CCS values.
  • Figure 2: Predictions from three augmentations shown in different coloured boxes and the associated CCS computation.
  • Figure 3: Scatter plots comparing CCS with established metrics: F1-Score, pPDQ, and OC-cost.
  • Figure 4: Sorted trend analysis showing the alignment of CCS with F1-Score, pPDQ, and OC-cost.

Theorems & Definitions (4)

  • Lemma 1
  • proof
  • Lemma 2: Monotonicity intuition
  • proof