Gaussian Combined Distance: A Generic Metric for Object Detection

Ziqian Guan, Xieyi Fu, Pengjun Huang, Hengyuan Zhang, Hubin Du, Yongtao Liu, Yinglin Wang, Qang Ma

TL;DR

Small object detection is challenged by scale variation and by IoU's sensitivity to minor positional deviations. We model bounding boxes as 2D Gaussians and introduce the Gaussian Combined Distance ($D_{gc}^2$), a scale-invariant metric that facilitates joint optimization, complemented by a nonlinear normalization $M_{gcd}=\exp\left(-\sqrt{D_{gc}^2}\right)$ that yields a bounded similarity. GCD serves as both a regression loss and a label-assignment metric, achieving state-of-the-art results on AI-TOD-v2 and generalizing to VisDrone-2019 and MS-COCO-2017, where it outperforms the Wasserstein Distance. The code is open source.
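As a quick illustration of the normalization stated in the TL;DR, the sketch below maps a non-negative squared distance to a similarity in (0, 1]. The function name `normalized_similarity` and the sample values are assumptions made for illustration; the definition of $D_{gc}^2$ itself is not given in this summary and is not reproduced here.

```python
import math


def normalized_similarity(d2: float) -> float:
    """Map a non-negative squared distance d2 to exp(-sqrt(d2)).

    The result is 1.0 when d2 == 0 and decays toward 0 as d2 grows.
    """
    if d2 < 0:
        raise ValueError("squared distance must be non-negative")
    return math.exp(-math.sqrt(d2))


# Hypothetical squared distances, chosen only to show the bounded range.
for d2 in (0.0, 0.25, 1.0, 4.0):
    print(d2, round(normalized_similarity(d2), 4))
```

Because the mapping is strictly decreasing and bounded, a larger value always means a closer match, which is the property a label-assignment threshold would rely on.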

Abstract

In object detection, a well-defined similarity metric can significantly enhance model performance. Currently, the IoU-based similarity metric is the most commonly preferred choice for detectors. However, detectors using IoU as a similarity metric often perform poorly when detecting small objects because of IoU's sensitivity to minor positional deviations. To address this issue, recent studies have proposed the Wasserstein Distance as an alternative to IoU for measuring the similarity of Gaussian-distributed bounding boxes. However, we have observed that the Wasserstein Distance lacks scale invariance, which negatively impacts the model's generalization capability. Additionally, when used as a loss function, its independent optimization of the center attributes leads to slow model convergence and unsatisfactory detection precision. To address these challenges, we introduce the Gaussian Combined Distance (GCD). Through analytical examination of GCD and its gradient, we demonstrate that GCD not only possesses scale invariance but also facilitates joint optimization, which enhances model localization performance. Extensive experiments on the AI-TOD-v2 dataset for tiny object detection show that GCD, as a bounding box regression loss function and label assignment metric, achieves state-of-the-art performance across various detectors. We further validate the generalizability of GCD on the MS-COCO-2017 and VisDrone-2019 datasets, where it outperforms the Wasserstein Distance across datasets of diverse scales. Code is available at https://github.com/MArKkwanGuan/mmdet-GCD.
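To make the scale-invariance point concrete, the sketch below evaluates the closed-form squared 2-Wasserstein distance between two axis-aligned Gaussians and shows that enlarging both boxes by a factor k multiplies the distance by k squared. The mapping from a box (cx, cy, w, h) to a Gaussian with mean (cx, cy) and standard deviations (w/2, h/2) follows a convention common in earlier Wasserstein-based work and is an assumption here, as are the helper names and sample boxes. GCD itself is not implemented in this sketch.

```python
def box_to_gaussian(cx, cy, w, h):
    """Assumed convention: mean at the box center, std devs w/2 and h/2."""
    return (cx, cy), (w / 2.0, h / 2.0)


def wasserstein2_sq(box_a, box_b):
    """Squared 2-Wasserstein distance between two diagonal-covariance Gaussians.

    For diagonal covariances this reduces to the squared distance between
    the means plus the squared distance between the standard deviations.
    """
    (mxa, mya), (sxa, sya) = box_to_gaussian(*box_a)
    (mxb, myb), (sxb, syb) = box_to_gaussian(*box_b)
    center_term = (mxa - mxb) ** 2 + (mya - myb) ** 2
    shape_term = (sxa - sxb) ** 2 + (sya - syb) ** 2
    return center_term + shape_term


def scale_box(box, k):
    """Scale every box coordinate by k, as if the image were resized."""
    return tuple(k * v for v in box)


# Hypothetical predicted and ground-truth boxes in (cx, cy, w, h) form.
pred = (10.0, 10.0, 8.0, 6.0)
gt = (12.0, 11.0, 10.0, 6.0)

d_small = wasserstein2_sq(pred, gt)
d_large = wasserstein2_sq(scale_box(pred, 4.0), scale_box(gt, 4.0))
print(d_small, d_large, d_large / d_small)  # ratio equals 4.0 ** 2
```

The two box pairs overlap in exactly the same relative way, yet the distance differs by a factor of 16, so a fixed threshold or loss weight tuned at one object scale does not transfer to another.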

Paper Structure

This paper contains 14 sections, 9 equations, 1 figure, 4 tables.

Figures (1)

  • Figure 1: Visualization results on AI-TOD-v2 with RetinaNet. From left to right: GCD, NWD, WD, and GIoU. Green boxes are ground truth (GT), and red boxes are predictions. GCD shows the best detection performance.