Class Imbalance in Object Detection: An Experimental Diagnosis and Study of Mitigation Strategies

Nieves Crasto

TL;DR

This paper addresses foreground-foreground class imbalance in single-stage object detection, focusing on YOLOv5s and edge-deployed scenarios. It introduces COCO-ZIPF, a 10-class long-tailed subset of COCO, and a PyTorch-based benchmarking framework to evaluate imbalance mitigation strategies, specifically sampling, loss weighting, and augmentation. The study finds that sampling and loss reweighting offer limited or negative benefits for YOLOv5 on COCO-ZIPF, while mosaic and mixup augmentations consistently improve mean Average Precision ($\text{mAP}$), with mosaic+mixup providing the strongest gains. The work provides practical guidance for handling class imbalance in lightweight detectors and releases code to support reproducibility and further research.

Abstract

Object detection, a pivotal task in computer vision, is frequently hindered by dataset imbalances, particularly the under-explored issue of foreground-foreground class imbalance. This gap is even more pronounced for single-stage detectors. This study introduces a benchmarking framework built on the YOLOv5 single-stage detector to study foreground-foreground class imbalance. We constructed a new 10-class long-tailed dataset from COCO, termed COCO-ZIPF, tailored to reflect common real-world detection scenarios with a limited number of object classes. Against this backdrop, we examined three established techniques: sampling, loss weighting, and data augmentation. Our comparative analysis shows that sampling and loss reweighting, although shown to be beneficial for two-stage detectors, do not translate as effectively into improved YOLOv5 performance on COCO-ZIPF. In contrast, data augmentation methods, specifically mosaic and mixup, significantly improve the model's mean Average Precision (mAP) by introducing more variability and complexity into the training data. (Code available: https://github.com/craston/object_detection_cib)
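To make the loss weighting idea concrete, here is a minimal sketch of one common scheme, inverse-frequency class weights. It is an illustration only: the abstract does not say which weighting formula the paper uses, and the function name `inverse_frequency_weights`, the class names, and the counts are all hypothetical.

```python
def inverse_frequency_weights(instance_counts: dict[str, int]) -> dict[str, float]:
    """Return a per-class loss weight proportional to 1 / class frequency.

    Assumption: "balanced" weighting, weight_c = total / (num_classes * count_c),
    so rare classes get weights above 1 and frequent classes below 1.
    """
    total = sum(instance_counts.values())
    num_classes = len(instance_counts)
    return {
        name: total / (num_classes * count)
        for name, count in instance_counts.items()
    }


# Hypothetical long-tailed instance counts (not taken from COCO-ZIPF).
counts = {"person": 8000, "car": 1500, "dog": 500}
weights = inverse_frequency_weights(counts)

for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

With this normalization, the weighted instance total equals the unweighted total, so the overall loss scale stays roughly comparable while rare classes contribute more per instance.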

Paper Structure (17 sections, 9 equations, 2 figures, 1 table)

Figures (2)

  • Figure 1: Class distributions in the COCO-ZIPF dataset. The left chart (a) shows the number of images per class; the right chart (b) shows the number of object instances per class. Both charts follow a Zipfian distribution, highlighting the long tail in class representation within the dataset.
  • Figure 2:
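As a rough illustration of the Zipfian shape described for Figure 1, the sketch below generates target counts for a 10-class long-tailed subset. The exponent, the head-class count, and the function name `zipf_class_counts` are assumptions for illustration; they are not the values or code used to build COCO-ZIPF.

```python
def zipf_class_counts(
    num_classes: int = 10,
    head_count: int = 10_000,
    exponent: float = 1.0,
) -> list[int]:
    """Return target counts where the class at rank k gets head_count / k**exponent.

    Assumption: a pure power law; real datasets only approximate this shape.
    """
    return [
        max(1, round(head_count / rank**exponent))
        for rank in range(1, num_classes + 1)
    ]


counts = zipf_class_counts()
# Ratio between the most and least frequent class, a simple imbalance measure.
imbalance_ratio = counts[0] / counts[-1]

for rank, count in enumerate(counts, start=1):
    print(f"rank {rank:2d}: {count}")
print(f"imbalance ratio: {imbalance_ratio:.1f}")
```

Raising `exponent` steepens the tail, which is a simple way to probe how sensitive a mitigation strategy is to the degree of imbalance.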