Gr-IoU: Ground-Intersection over Union for Robust Multi-Object Tracking with 3D Geometric Constraints

Keisuke Toida, Naoki Kato, Osamu Segawa, Takeshi Nakamura, Kazuhiro Hotta

TL;DR

Gr-IoU computes IoU on bounding boxes projected onto the ground plane, so it accounts for the 3D structure of the scene and is more sensitive to the front-to-back relationships of objects. This improves data association accuracy, reduces ID switches, and outperforms conventional real-time methods that do not use appearance features.

Abstract

We propose Ground IoU (Gr-IoU) to address the data association problem in multi-object tracking. When tracking objects detected by a camera, the same object is often assigned different IDs in consecutive frames, especially when objects are close to each other or overlapping. To address this issue, Gr-IoU takes the 3D structure of the scene into account: it transforms conventional bounding boxes from image space to the ground plane using vanishing point geometry. The IoU calculated with these transformed bounding boxes is more sensitive to the front-to-back relationships of objects, which improves data association accuracy and reduces ID switches. We evaluated Gr-IoU on the MOT17 and MOT20 datasets, which contain diverse tracking scenarios including crowded scenes and sequences with frequent occlusions. Experimental results show that Gr-IoU outperforms conventional real-time methods that do not use appearance features.

Paper Structure

This paper contains 13 sections, 4 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Conventional bounding boxes
  • Figure 2: Transformed bounding boxes
  • Figure 4: Our tracking pipeline. The pipeline follows the paradigm proposed in ByteTrack [zhang2022bytetrack], dividing the matching process according to the detection precision score (det).
  • Figure 5: Overview of Gr-IoU. Conventional cost matrix calculations use standard IoU in camera space. Our method instead projects the detected bounding boxes onto the ground plane and calculates IoU on the transformed rectangles. This alleviates redundancy in the cost matrix and enables more efficient matching for nearby and occluded objects.
  • Figure 6: Conventional IoU
  • ...and 2 more figures
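
The contrast drawn in the Figure 5 overview, image-space IoU versus IoU on ground-plane rectangles, can be illustrated with a small sketch. This is not the authors' vanishing-point transformation, whose exact form is not given in this summary. It swaps in a simplified level pinhole camera whose horizon row stands in for the vanishing-point geometry. The names `CameraModel`, `ground_footprint`, and `ground_iou`, the fixed footprint `depth`, and all numeric values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraModel:
    """Hypothetical level pinhole camera (illustrative values, not from the paper)."""
    cx: float         # principal point x, in pixels
    horizon_y: float  # image row of the horizon (vanishing line), in pixels
    focal: float      # focal length, in pixels
    height: float     # camera height above the ground, in metres


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ground_footprint(box, cam, depth=0.5):
    """Map the bottom edge of an image box to a ground rectangle (X1, Z1, X2, Z2).

    Assumption: the bottom edge touches the ground, and the object occupies
    a fixed `depth` (metres) behind that edge.
    """
    x1, _, x2, y2 = box
    dy = y2 - cam.horizon_y
    if dy <= 0:
        raise ValueError("bottom edge must lie below the horizon")
    z = cam.focal * cam.height / dy          # distance from the camera
    gx1 = (x1 - cam.cx) * cam.height / dy    # lateral position of left corner
    gx2 = (x2 - cam.cx) * cam.height / dy    # lateral position of right corner
    return (gx1, z, gx2, z + depth)


def ground_iou(a, b, cam, depth=0.5):
    """IoU of two image boxes after projecting both onto the ground plane."""
    return box_iou(ground_footprint(a, cam, depth), ground_footprint(b, cam, depth))


cam = CameraModel(cx=960.0, horizon_y=400.0, focal=1000.0, height=1.5)
near = (910.0, 360.0, 1010.0, 700.0)       # person about 5 m from the camera
far = (930.0, 374.0, 996.0, 600.0)         # person about 7.5 m away, behind `near`
near_next = (915.0, 362.0, 1015.0, 702.0)  # `near` one frame later

print("image IoU, near vs far:       ", round(box_iou(near, far), 3))
print("ground IoU, near vs far:      ", round(ground_iou(near, far, cam), 3))
print("ground IoU, near vs near_next:", round(ground_iou(near, near_next, cam), 3))
```

In this toy setup the far box lies entirely inside the near box in the image, so image-space IoU is substantial even though the two people stand metres apart. On the ground plane their footprints do not overlap, while the same person in consecutive frames still overlaps strongly. In a tracker, a value such as `1 - ground_iou(...)` would take the place of the usual IoU cost in the association matrix.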