Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression

Zhaohui Zheng, Ping Wang, Wei Liu, Jinze Li, Rongguang Ye, Dongwei Ren

TL;DR

This work targets bounding-box regression in object detection, where traditional $\ell_n$-norm losses, and even the IoU and GIoU losses, can converge slowly or localize poorly. It introduces the Distance-IoU (DIoU) loss, which adds a normalized center-point distance penalty to the IoU loss, and the Complete IoU (CIoU) loss, which further adds an aspect-ratio consistency term; both converge faster and regress more accurately. The authors report improved performance with YOLOv3, SSD, and Faster R-CNN on PASCAL VOC and MS COCO, and obtain further gains by using DIoU as the NMS criterion (DIoU-NMS). The methods are straightforward to integrate, and the code and trained models are public.

Abstract

Bounding box regression is the crucial step in object detection. In existing methods, while $\ell_n$-norm loss is widely adopted for bounding box regression, it is not tailored to the evaluation metric, i.e., Intersection over Union (IoU). Recently, IoU loss and generalized IoU (GIoU) loss have been proposed to benefit the IoU metric, but they still suffer from slow convergence and inaccurate regression. In this paper, we propose a Distance-IoU (DIoU) loss by incorporating the normalized distance between the predicted box and the target box, which converges much faster in training than IoU and GIoU losses. Furthermore, this paper summarizes three geometric factors in bounding box regression, i.e., overlap area, central point distance and aspect ratio, based on which a Complete IoU (CIoU) loss is proposed, thereby leading to faster convergence and better performance. By incorporating DIoU and CIoU losses into state-of-the-art object detection algorithms, e.g., YOLO v3, SSD and Faster RCNN, we achieve notable performance gains in terms of not only the IoU metric but also the GIoU metric. Moreover, DIoU can be easily adopted into non-maximum suppression (NMS) to act as the criterion, further boosting performance. The source code and trained models are available at https://github.com/Zzh-tju/DIoU.
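
To make the three geometric factors concrete, the following is a minimal sketch of the two losses for a single pair of axis-aligned boxes. It assumes the commonly cited forms DIoU loss = 1 - IoU + d²/c² and CIoU loss = DIoU loss + αv, where d is the distance between box centers, c is the diagonal of the smallest enclosing box, v measures aspect-ratio mismatch, and α is its trade-off weight. The function name `diou_ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are illustrative assumptions, not taken from the authors' repository.

```python
import math


def diou_ciou_loss(pred, target, eps=1e-9):
    """Return (iou, diou_loss, ciou_loss) for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; eps is an assumed numerical stabilizer.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Factor 1: overlap area, expressed through IoU.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Factor 2: squared distance between the two box centers ...
    d2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
        + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    # ... normalized by the squared diagonal of the smallest enclosing box.
    enclose_w = max(px2, tx2) - min(px1, tx1)
    enclose_h = max(py2, ty2) - min(py1, ty1)
    c2 = enclose_w ** 2 + enclose_h ** 2 + eps
    diou_loss = 1.0 - iou + d2 / c2

    # Factor 3: aspect-ratio consistency v and its trade-off weight alpha.
    v = (4.0 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / ((1.0 - iou) + v + eps)
    ciou_loss = diou_loss + alpha * v
    return iou, diou_loss, ciou_loss


if __name__ == "__main__":
    # Two unit squares that do not overlap: the IoU loss alone would be flat
    # here, but the distance term still reflects how far apart the boxes are.
    print(diou_ciou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0)))
```

In a detector the same arithmetic would be applied to batched tensors inside the training loss; the scalar version here only shows how the penalty terms are assembled.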

Paper Structure

This paper contains 20 sections, 13 equations, 9 figures, 3 tables, and 1 algorithm.

Figures (9)

  • Figure 1: Bounding box regression steps by GIoU loss (first row) and DIoU loss (second row). Green and black denote target box and anchor box, respectively. Blue and red denote predicted boxes for GIoU loss and DIoU loss, respectively. GIoU loss generally increases the size of predicted box to overlap with target box, while DIoU loss directly minimizes normalized distance of central points.
  • Figure 2: GIoU loss degrades to IoU loss for these cases, while our DIoU loss is still distinguishable. Green and red denote target box and predicted box respectively.
  • Figure 3: Simulation experiments: (a) 1,715,000 regression cases are adopted by considering different distances, scales and aspect ratios, (b) regression error sum (i.e., $\sum_{n} \mathbf{E}(t,n)$) curves of different loss functions at iteration $t$.
  • Figure 4: Visualization of regression errors of IoU, GIoU and DIoU losses at the final iteration $T$, i.e., $\mathbf{E}(T,n)$ for every coordinate $n$. We note that the basins in (a) and (b) correspond to good regression cases. One can see that IoU loss has large errors for non-overlapping cases, GIoU loss has large errors for horizontal and vertical cases, and our DIoU loss leads to very small regression errors everywhere.
  • Figure 5: DIoU loss for bounding box regression, where the normalized distance between central points can be directly minimized. $c$ is the diagonal length of the smallest enclosing box covering two boxes, and $d=\rho(\mathbf{b},\mathbf{b}^{gt})$ is the distance of central points of two boxes.
  • ...and 4 more figures
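
The situation described in the Figure 2 caption can be reproduced numerically with the quantities $c$ and $d$ from the Figure 5 caption. The sketch below places a small predicted box at two positions inside a larger target box. It assumes the standard definition GIoU = IoU - (enclosing area - union) / enclosing area and the DIoU loss form 1 - IoU + d²/c²; the helper name `box_losses` and the specific box coordinates are illustrative choices, not values from the paper.

```python
def box_losses(pred, target):
    """Return (iou_loss, giou_loss, diou_loss) for boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; assumes both boxes have positive area.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection and union areas.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union

    # Smallest box enclosing both inputs.
    enclose_w = max(px2, tx2) - min(px1, tx1)
    enclose_h = max(py2, ty2) - min(py1, ty1)
    enclose_area = enclose_w * enclose_h

    # GIoU penalizes the part of the enclosing box not covered by the union.
    giou = iou - (enclose_area - union) / enclose_area

    # DIoU penalizes d^2 / c^2: squared center distance over squared diagonal.
    d2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
        + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    c2 = enclose_w ** 2 + enclose_h ** 2

    return 1.0 - iou, 1.0 - giou, 1.0 - iou + d2 / c2


if __name__ == "__main__":
    target = (0.0, 0.0, 4.0, 4.0)
    # Same-sized predictions, both fully inside the target box.
    in_corner = (0.0, 0.0, 1.0, 1.0)
    in_center = (1.5, 1.5, 2.5, 2.5)
    for name, pred in (("corner", in_corner), ("center", in_center)):
        iou_l, giou_l, diou_l = box_losses(pred, target)
        print(f"{name}: IoU loss={iou_l:.4f} GIoU loss={giou_l:.4f} DIoU loss={diou_l:.4f}")
```

Because the enclosing box coincides with the target box in both placements, the GIoU penalty vanishes and GIoU loss equals IoU loss, so neither loss distinguishes the two positions. The DIoU loss is lower for the centered prediction, which is the distinction the caption points to.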