Extending Dataset Pruning to Object Detection: A Variance-based Approach

Ryota Yagi

TL;DR

This work extends dataset pruning to object detection by formulating a principled pruning objective for detection and addressing three core problems: Object-Level Attribution, Scoring Strategy, and Image-Level Aggregation. It introduces the Variance-based Prediction Score (VPS), which uses the variance of IoU and confidence outputs across training epochs to rank informative samples. VPS is paired with Class-Prioritized IoU Aware Prediction Assignment (CIPA) for object-level attribution and a simple aggregation step that produces image-level scores. Empirical results on PASCAL VOC and MS COCO show that VPS-based pruning consistently outperforms traditional pruning baselines in mean Average Precision (mAP) across pruning ratios and generalizes across architectures. The findings suggest that selecting informative examples matters more than merely increasing dataset size or balancing classes, which points toward more efficient training for complex vision tasks, with lower training cost and resource use.

Abstract

Dataset pruning -- selecting a small yet informative subset of training data -- has emerged as a promising strategy for efficient machine learning, offering significant reductions in computational cost and storage compared to alternatives like dataset distillation. While pruning methods have shown strong performance in image classification, their extension to more complex computer vision tasks, particularly object detection, remains relatively underexplored. In this paper, we present the first principled extension of classification pruning techniques to the object detection domain, to the best of our knowledge. We identify and address three key challenges that hinder this transition: the Object-Level Attribution Problem, the Scoring Strategy Problem, and the Image-Level Aggregation Problem. To overcome these, we propose tailored solutions, including a novel scoring method called Variance-based Prediction Score (VPS). VPS leverages both Intersection over Union (IoU) and confidence scores to effectively identify informative training samples specific to detection tasks. Extensive experiments on PASCAL VOC and MS COCO demonstrate that our approach consistently outperforms prior dataset pruning methods in terms of mean Average Precision (mAP). We also show that annotation count and class distribution shift can influence detection performance, but selecting informative examples is a more critical factor than dataset size or balance. Our work bridges dataset pruning and object detection, paving the way for dataset pruning in complex vision tasks.
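
The pipeline described above (score each object by how much its predictions vary over training, aggregate to the image, then prune) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the exact VPS formula is not given in this summary, so the sketch assumes the score is the standard deviation of a per-object quantity (IoU or confidence) across epochs, computed separately for each quantity. The function names, the array layout, and the choice to prune the lowest-scoring images are all assumptions made for illustration.

```python
import numpy as np


def variance_scores(history):
    """Per-object spread across epochs (assumed stand-in for VPS).

    history: array of shape (num_epochs, num_objects) holding, for each
    ground-truth object, the IoU (or confidence) of its assigned
    prediction at every epoch.
    """
    history = np.asarray(history, dtype=float)
    return history.std(axis=0)


def aggregate_to_images(object_scores, image_ids, num_images, how="max"):
    """Collapse object-level scores into one score per image."""
    object_scores = np.asarray(object_scores, dtype=float)
    image_ids = np.asarray(image_ids)
    image_scores = np.zeros(num_images)
    for img in range(num_images):
        s = object_scores[image_ids == img]
        if s.size == 0:
            continue  # images without annotations keep a score of 0
        image_scores[img] = {"max": s.max, "average": s.mean, "sum": s.sum}[how]()
    return image_scores


def prune(image_scores, prune_ratio):
    """Return sorted indices of the images kept after dropping the lowest scores."""
    n = len(image_scores)
    num_pruned = int(prune_ratio * n)  # floor, so at least n - num_pruned survive
    order = np.argsort(-np.asarray(image_scores), kind="stable")
    return np.sort(order[: n - num_pruned])


# Hypothetical IoU history: 3 epochs, 4 objects spread over 3 images.
iou_history = [
    [0.2, 0.9, 0.5, 0.8],
    [0.6, 0.9, 0.5, 0.1],
    [0.4, 0.9, 0.5, 0.9],
]
image_ids = [0, 0, 1, 2]

obj_scores = variance_scores(iou_history)
img_scores = aggregate_to_images(obj_scores, image_ids, num_images=3, how="max")
kept = prune(img_scores, prune_ratio=0.5)
```

In this toy input, image 1 contains a single object whose IoU never changes, so it receives the lowest score and is the one pruned. Swapping `how` between `"max"`, `"average"`, and `"sum"` mirrors the aggregation variants compared in Figure 4.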

Paper Structure

This paper contains 39 sections, 4 equations, 9 figures, 6 tables, 1 algorithm.

Figures (9)

  • Figure 1: Overview of the score assignment process for object detection. For each ground-truth box, the most plausible prediction is selected (teal). Scores are then computed from the associated model outputs (yellow), aggregated (orange), and assigned to the image.
  • Figure 2: Visualization of VPS scores (IoU, confidence) for objects in the PASCAL VOC dataset. Each subplot shows the average score (y-axis) against the standard deviation (x-axis), colored by correctness or by forgetting events. (a, b) show VPSiou, and (c, d) show VPSconf. (a, c) are colored by correctness, while (b, d) are colored by forgetting events.
  • Figure 3: Samples from PASCAL VOC (Everingham et al., 2010) that are pruned because of low VPSiou scores under max aggregation, shown with their ground-truth bounding boxes. (a, b) show examples with high IoUmean scores (easy), where foreground and background are clearly separable. (c, d) show difficult cases with low IoUmean: (c) contains a distant car, and (d) contains a black sheep's head against a dark background.
  • Figure 4: mAP comparison of different aggregation methods (max, average, sum) on the PASCAL VOC (Everingham et al., 2010) dataset across varying pruning rates.
  • Figure 5: Plots showing averaged mAP vs. number of annotations and JS Divergence under high (90%) and low (30%) pruning on PASCAL VOC, with correlation coefficients (r) annotated.
  • ...and 4 more figures
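
Figure 5 relates mAP to the JS divergence of the class distribution. As a hedged illustration of that quantity, the sketch below computes the Jensen-Shannon divergence between two class-count histograms, for example the per-class annotation counts of the full and pruned sets. Which distributions the paper actually compares and which logarithm base it uses are not stated in this summary; the function name, the base-2 logarithm, and the example counts are assumptions.

```python
import numpy as np


def js_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence (base 2) between two count histograms.

    Both inputs are per-class counts over the same class list; they are
    normalized to probability distributions before comparison.
    """
    p = np.asarray(p_counts, dtype=float)
    q = np.asarray(q_counts, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical per-class annotation counts before and after pruning.
full_counts = [500, 300, 200]
pruned_counts = [40, 35, 25]
shift = js_divergence(full_counts, pruned_counts)
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (disjoint support), which makes shifts at different pruning rates directly comparable.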