From COCO to COCO-FP: A Deep Dive into Background False Positives for COCO Detectors

Longfei Liu, Wen Guo, Shihua Huang, Cheng Li, Xi Shen

TL;DR

By extending the original COCO validation dataset, COCO-FP specifically assesses object detectors' performance in mitigating background false positives, and shows a significant number of false positives in both standard and advanced object detectors.

Abstract

Reducing false positives is essential for enhancing object detector performance, as reflected in the mean Average Precision (mAP) metric. Although object detectors have achieved notable improvements and high mAP scores on the COCO dataset, analysis reveals limited progress in addressing false positives caused by non-target visual clutter, that is, background objects not included in the annotated categories. This issue is particularly critical in real-world applications, such as fire and smoke detection, where minimizing false alarms is crucial. In this study, we introduce COCO-FP, a new evaluation dataset derived from the ImageNet-1K dataset, designed to address this issue. By extending the original COCO validation dataset, COCO-FP specifically assesses object detectors' performance in mitigating background false positives. Our evaluation of both standard and advanced object detectors shows a significant number of false positives in both closed-set and open-set scenarios. For example, the AP50 metric for YOLOv9-E decreases from 72.8 to 65.7 when shifting from COCO to COCO-FP. The dataset is available at https://github.com/COCO-FP/COCO-FP.
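To make the link between background false positives and AP concrete, here is a minimal sketch with toy numbers. It is not the COCO evaluation protocol: it uses all-point interpolation for a single class, and the scores, the `average_precision` helper, and the detection lists are hypothetical. It only illustrates why adding images that contain no annotated objects can lower AP: every detection on such an image is a false positive that enters the ranked list without adding any ground truth.

```python
def average_precision(detections, num_gt):
    """All-point interpolated AP for one class (simplified, not COCO's 101-point variant).

    detections: list of (score, is_true_positive) pairs.
    num_gt: number of ground-truth boxes for the class.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the precision-recall envelope.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Toy detections on annotated images (hypothetical scores), 4 ground-truth boxes.
coco_dets = [(0.9, True), (0.8, True), (0.6, False), (0.5, True)]
# Toy high-scoring detections on background-only images: all false positives.
background_fps = [(0.85, False), (0.7, False)]

ap_coco = average_precision(coco_dets, num_gt=4)
ap_fp = average_precision(coco_dets + background_fps, num_gt=4)
print(f"AP without background images: {ap_coco:.4f}")
print(f"AP with background images:    {ap_fp:.4f}")
```

In this toy case the ground truth is unchanged, yet AP drops because the two high-scoring background detections push true positives further down the ranking.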

Paper Structure (17 sections, 8 figures, 2 tables)


Figures (8)

  • Figure 1: Summary of errors for object detectors on the COCO lin2014microsoft dataset. $\Delta\text{AP}$ illustrates the absolute contribution of each error type, computed from TIDE tide-eccv2020. We apply error analysis to Mask RCNN He_2017_ICCV, RTMDet lyu2022rtmdet, and YOLOv9 wang2024yolov9, with their mAP scores reaching 40.1, 52.8, and 55.6, respectively, on the COCO Val dataset.
  • Figure 2: Qualitative and quantitative analysis of false positives for object detectors. (a) False positive examples of YOLOv9 wang2024yolov9 on our COCO-FP dataset. Note that the confidence scores of the detected bounding boxes are important. (b) The performance of closed-set detectors. Moving from COCO to COCO-FP results in a significant decrease in performance. (c) The performance of open-set detectors. The term "w. FP category" refers to using the names of the extra image categories as an additional text prompt for detection. YOLO-World-X Cheng2024YOLOWorld is pre-trained on a large-scale dataset and then fine-tuned on the COCO lin2014microsoft dataset, whereas Grounding DINO-B liu2023grounding is trained on a large-scale dataset that includes COCO lin2014microsoft.
  • Figure 3: Dataset collection pipeline. Initially, categories in ImageNet deng2009imagenet that semantically overlap with those in COCO lin2014microsoft are excluded. Subsequently, the COCO-trained detector Co-DETR zong2022detrs is used to identify categories that produce false positive predictions. Next, all images of the remaining categories are processed through Co-DETR zong2022detrs, with each image being filtered individually. Finally, to ensure dataset diversity and a balanced distribution, we retain at most 100 images per category and, when multiple categories in ImageNet deng2009imagenet are misidentified as the same category in COCO lin2014microsoft, retain only some of those categories.
  • Figure 4: Visualization of false positive predictions on COCO-FP for different object detectors: (a) YOLOv9-E wang2024yolov9, (b) RTMDet-X lyu2022rtmdet, and (c) Grounding DINO-B liu2023grounding (trained on COCO lin2014microsoft and without the FP category provided as a text prompt). Note that these false positive predictions have relatively high scores. Visual results with other object detectors are provided in the Appendix.
  • Figure 5: The impact of false positive bounding box scores on mAP. The horizontal axis represents the maximum score threshold applied to bounding boxes produced by YOLOv9-E wang2024yolov9 from the 3,772 images, while the vertical axis shows the corresponding mAP on the COCO-FP dataset. This indicates that the detector produces a substantial number of high-scoring false positives, a critical issue for real-world applications.
  • ...and 3 more figures
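
The collection pipeline in the Figure 3 caption can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the authors' code: `select_fp_images`, the `detect` callable (a stand-in for a COCO-trained detector such as Co-DETR), the `score_threshold` value, and the keep-the-first-100 rule are all hypothetical. The sketch also merges the category-level and image-level filtering into one pass and omits the final step that resolves several ImageNet categories mapping to the same COCO category, since the caption does not specify how that is done.

```python
from collections import defaultdict

MAX_IMAGES_PER_CATEGORY = 100  # cap stated in the Figure 3 caption


def select_fp_images(images, overlapping_categories, detect, score_threshold=0.5):
    """Keep background images on which a COCO-trained detector fires.

    images: iterable of (image_id, imagenet_category) pairs.
    overlapping_categories: ImageNet categories that semantically overlap COCO.
    detect: hypothetical callable, image_id -> list of (coco_label, score).
    score_threshold: assumed cutoff for counting a prediction as a false positive.
    """
    kept = defaultdict(list)
    for image_id, category in images:
        if category in overlapping_categories:
            continue  # step 1: drop categories that overlap with COCO
        if len(kept[category]) >= MAX_IMAGES_PER_CATEGORY:
            continue  # final step: at most 100 images per category
        predictions = detect(image_id)
        # Any confident prediction on a non-COCO image is a false positive.
        if any(score >= score_threshold for _, score in predictions):
            kept[category].append(image_id)
    # Drop categories that never triggered a false positive.
    return {category: ids for category, ids in kept.items() if ids}


# Toy usage with a stub detector (all names and scores are made up).
def stub_detect(image_id):
    if image_id.startswith("volcano"):
        return [("fire hydrant", 0.8)]
    return [("bench", 0.1)]


toy_images = [(f"volcano_{i}", "volcano") for i in range(150)]
toy_images += [("dog_0", "dog"), ("cliff_0", "cliff")]
selected = select_fp_images(toy_images, {"dog"}, stub_detect)
print({category: len(ids) for category, ids in selected.items()})
```

In the toy run, the overlapping category is skipped, the category that never triggers a confident prediction is dropped, and the remaining category is capped at 100 images.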