Distribution-Aware Calibration for Object Detection with Noisy Bounding Boxes

Donghao Zhou, Jialin Li, Jinpeng Li, Jiancheng Huang, Qiang Nie, Yong Liu, Bin-Bin Gao, Qiong Wang, Pheng-Ann Heng, Guangyong Chen

TL;DR

DIStribution-aware CalibratiOn (DISCO) is proposed to model the spatial distribution of proposals for calibrating supervision signals, and three distribution-aware techniques are developed to improve classification, localization, and interpretability.

Abstract

Large-scale well-annotated datasets are of great importance for training an effective object detector. However, obtaining accurate bounding box annotations is laborious and demanding. Unfortunately, the resultant noisy bounding boxes can corrupt the supervision signals and thus diminish detection performance. Motivated by the observation that the real ground-truth is usually situated in the aggregation region of the proposals assigned to a noisy ground-truth, we propose DIStribution-aware CalibratiOn (DISCO) to model the spatial distribution of proposals for calibrating supervision signals. In DISCO, spatial distribution modeling is performed to statistically extract the potential locations of objects. Based on the modeled distribution, three distribution-aware techniques, i.e., distribution-aware proposal augmentation (DA-Aug), distribution-aware box refinement (DA-Ref), and distribution-aware confidence estimation (DA-Est), are developed to improve classification, localization, and interpretability, respectively. Extensive experiments on large-scale noisy image datasets (i.e., Pascal VOC and MS-COCO) demonstrate that DISCO can achieve state-of-the-art detection performance, especially at high noise levels. Code is available at https://github.com/Correr-Zhou/DISCO.
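To make the idea of "modeling the spatial distribution of proposals" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each border of the box (x1, y1, x2, y2) is summarized by a weighted mean and variance over the proposals assigned to one noisy ground-truth. The helper name `model_border_distribution` and the choice of weights are hypothetical; the abstract does not specify how proposals are weighted.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def model_border_distribution(
    proposals: Sequence[Box], weights: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return the per-border weighted mean and variance of the proposals.

    Hypothetical helper: the weighting scheme (for example, classification
    scores) is an assumption and is not taken from the paper.
    """
    if not proposals or len(proposals) != len(weights):
        raise ValueError("need at least one proposal and one weight per proposal")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    means: List[float] = []
    variances: List[float] = []
    for k in range(4):  # iterate over the borders x1, y1, x2, y2
        mean = sum(w * p[k] for p, w in zip(proposals, weights)) / total
        var = sum(w * (p[k] - mean) ** 2 for p, w in zip(proposals, weights)) / total
        means.append(mean)
        variances.append(var)
    return means, variances


# Three proposals assigned to one noisy ground-truth box, with assumed weights.
proposals = [
    (10.0, 10.0, 50.0, 50.0),
    (12.0, 10.0, 52.0, 50.0),
    (14.0, 10.0, 54.0, 50.0),
]
weights = [1.0, 2.0, 1.0]
means, variances = model_border_distribution(proposals, weights)
```

Under these assumptions, the means give a candidate refined box located in the aggregation region of the proposals, and the variances indicate how tightly the proposals agree on each border.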

Paper Structure (23 sections, 15 equations, 7 figures, 6 tables)

Figures (7)

  • Figure 1: (a) Trait comparison of existing solutions and our DISCO. Their learning behaviors for one single border of bounding boxes are presented above. (b) Proposal aggregation in object detection with noisy bounding boxes. The real ground-truth is usually situated in the aggregation region of the proposals assigned to a noisy ground-truth.
  • Figure 2: Training pipeline with DISCO. In DISCO, spatial distribution modeling (Section \ref{sec:SDM}) is performed first, followed by three distribution-aware techniques, i.e., DA-Aug (Section \ref{sec:DA-Aug}), DA-Ref (Section \ref{sec:DA-Ref}), and DA-Est (Section \ref{sec:DA-Est}), which build on the modeled distribution. Note that we additionally integrate an estimator into the vanilla detection head to construct the distribution-aware head (DA head) for the implementation of DISCO.
  • Figure 3: Illustration of spatial distribution modeling. For clarity, we present the process in the view of the whole box and one single border. Note that the length of the vertical line indicates its weight. Here the refined ground-truth is essentially a spatial distribution.
  • Figure 4: Illustration of classification performance improvement. Left: Average classification scores of the positive proposals for corresponding categories. Right: Classification accuracy of the positive proposals. Compared to OA-MIL, DISCO provides better improvement for classification, which even approaches the results of training with clean annotations.
  • Figure 5: (a) Qualitative results of box refinement in DISCO. Real ground-truths and noisy ground-truths are marked in orange and blue. Refined bounding boxes produced by the first-/second-time DISCO are indicated in dotted/solid red. The first-time refined boxes can cover the objects more tightly than noisy ground-truths, and the second-time refinement can further contribute to more precise ones. (b) Qualitative results of interpretability in DISCO. We randomly choose an assigned proposal (yellow) per image to report its estimated variances. Real ground-truths and noisy ground-truths are marked in orange and blue. Note that the variance is scaled by the width and height for clarity. With the proposed DA-Est, DISCO can estimate reasonable variances for each border of box prediction.
  • ...and 2 more figures