Gaussian Bounding Boxes and Probabilistic Intersection-over-Union for Object Detection

Jeffri M. Llerena, Luis Felipe Zeni, Lucas N. Kristen, Claudio Jung

Abstract

Most object detection methods use bounding boxes to encode and represent the object shape and location. In this work, we explore a fuzzy representation of object regions using Gaussian distributions, which provides an implicit binary representation as (potentially rotated) ellipses. We also present a similarity measure for the Gaussian distributions based on the Hellinger Distance, which can be viewed as a Probabilistic Intersection-over-Union (ProbIoU). Our experimental results show that the proposed Gaussian representations are closer to annotated segmentation masks in publicly available datasets, and that loss functions based on ProbIoU can be successfully used to regress the parameters of the Gaussian representation. Furthermore, we present a simple mapping scheme from traditional (or rotated) bounding boxes to Gaussian representations, allowing the proposed ProbIoU-based losses to be seamlessly integrated into any object detector.
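
To make the two ideas in the abstract concrete, the following is a minimal sketch of a box-to-Gaussian mapping and a Hellinger-based similarity between the resulting Gaussians. It is an illustration under stated assumptions, not necessarily the authors' exact formulation: the function names `box_to_gaussian` and `prob_iou` are hypothetical, the choice of variances w²/12 and h²/12 (the variances of a uniform distribution over the box) is an assumption, and the similarity is taken to be one minus the Hellinger distance, computed from the standard closed-form Bhattacharyya distance between two Gaussians.

```python
import math


def box_to_gaussian(cx, cy, w, h, theta=0.0):
    """Map a (possibly rotated) box to a 2D Gaussian.

    Assumption: the variances along the box axes are w^2/12 and h^2/12,
    i.e. those of a uniform distribution over the box. Requires w, h > 0.
    Returns ((mean_x, mean_y), (sxx, sxy, syy)).
    """
    a, b = w * w / 12.0, h * h / 12.0
    c, s = math.cos(theta), math.sin(theta)
    # Covariance Sigma = R * diag(a, b) * R^T for a rotation R by theta.
    sxx = a * c * c + b * s * s
    syy = a * s * s + b * c * c
    sxy = (a - b) * c * s
    return (cx, cy), (sxx, sxy, syy)


def prob_iou(g1, g2):
    """Similarity in [0, 1], defined here as 1 - Hellinger distance."""
    (x1, y1), (a1, c1, b1) = g1
    (x2, y2), (a2, c2, b2) = g2
    # Averaged covariance Sigma = (Sigma1 + Sigma2) / 2.
    a, c, b = (a1 + a2) / 2.0, (c1 + c2) / 2.0, (b1 + b2) / 2.0
    det = a * b - c * c
    det1 = a1 * b1 - c1 * c1
    det2 = a2 * b2 - c2 * c2
    dx, dy = x1 - x2, y1 - y2
    # Mahalanobis term d^T Sigma^{-1} d, using the 2x2 inverse in closed form.
    maha = (b * dx * dx - 2.0 * c * dx * dy + a * dy * dy) / det
    # Bhattacharyya distance between the two Gaussians.
    bd = maha / 8.0 + 0.5 * math.log(det / math.sqrt(det1 * det2))
    bc = math.exp(-bd)  # Bhattacharyya coefficient
    hd = math.sqrt(max(0.0, 1.0 - bc))  # Hellinger distance, clamped for rounding
    return 1.0 - hd


if __name__ == "__main__":
    box_a = box_to_gaussian(0.0, 0.0, 4.0, 2.0)
    box_b = box_to_gaussian(1.0, 0.0, 4.0, 2.0)
    print(prob_iou(box_a, box_b))
```

Because the similarity depends only on the means and covariances, the same function handles horizontal boxes (`theta=0.0`) and rotated boxes without special cases. This is the property that would let such a loss be attached to an existing detector's box outputs.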

Paper Structure

This paper contains 23 sections, 18 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Different annotation types for images in the COCO dataset: HBBs (red), OBBs (blue) and GBBs (green), as well as the GT segmentation masks.
  • Figure 2: Empirical comparison between IoU with HBBs and ProbIoU with (a) the respective GBBs and (b) uniform PDFs. We used a set of 5,000,000 pairs of random HBBs. The red line is the identity function.
  • Figure 3: IoU between segmentation masks and HBBs, OBBs and GBBs for each category of the COCO 2017 train set.
  • Figure 4: Top: object representations as HBBs (red), OBBs (blue) and GBB-induced ellipses (green). Bottom: corresponding segmentation masks.
  • Figure 5: OBB detection results (in green) for VOC dataset using SSD trained with our ProbIoU-based loss functions (ground truth HBBs shown in red).
  • ...and 3 more figures