
Fast-Poly: A Fast Polyhedral Framework For 3D Multi-Object Tracking

Xiaoyu Li, Dedong Liu, Yitao Wu, Xian Wu, Lijun Zhao, Jinghan Gao

TL;DR

Fast-Poly is a CPU-friendly, learning-free polyhedral framework for 3D MOT that reduces the cost of computing affinities for rotated objects, densifies local computation, and leverages parallelization. It aligns rotated objects in BEV and combines an Aligned Generalized IoU, voxel masks, a lightweight filter for time-invariant states, and a confidence-count lifecycle, which reduces computation while preserving or improving tracking accuracy. On nuScenes, Fast-Poly reaches state-of-the-art accuracy (75.8% AMOTA) at 34.2 FPS; on Waymo, it reaches competitive accuracy (63.6% MOTA) at 35.5 FPS, outperforming previous baselines such as Poly-MOT. The combination of alignment, densification, and parallelization provides a robust baseline for real-time 3D MOT, and the open-source implementation is intended to facilitate adoption and further research.

Abstract

3D Multi-Object Tracking (MOT) captures stable and comprehensive motion states of surrounding obstacles, which is essential for robotic perception. However, current 3D trackers struggle to deliver accuracy and latency consistently. In this paper, we propose Fast-Poly, a fast and effective filter-based method for 3D MOT. Building upon our previous work Poly-MOT, Fast-Poly addresses object rotational anisotropy in 3D space, enhances local computation densification, and leverages parallelization techniques, improving both inference speed and precision. Fast-Poly is extensively tested on two large-scale tracking benchmarks with a Python implementation. On the nuScenes dataset, Fast-Poly achieves new state-of-the-art performance with 75.8% AMOTA among all methods and runs at 34.2 FPS on a personal CPU. On the Waymo dataset, Fast-Poly exhibits competitive accuracy with 63.6% MOTA and impressive inference speed (35.5 FPS). The source code is publicly available at https://github.com/lixiaoyu2000/FastPoly.

Paper Structure

This paper contains 16 sections, 6 equations, 4 figures, and 7 tables.

Figures (4)

  • Figure 2: The pipeline of our proposed method. Our structure design is illustrated in the Architecture section. Real-time improvements to the baseline (li2023poly) are highlighted in distinct colors. Orange denotes the Alignment, which reduces the computational complexity of affinity calculations for rotated objects. Blue denotes the Densification, which increases computational efficiency. Cyan denotes the Parallelization, which executes pre-processing and motion prediction simultaneously to further improve computational efficiency.
  • Figure 3: Top: Affinity difference between $A\text{-}gIoU$ and $gIoU$ for identical bbox pairs under varying geometric conditions in 3D space. (a): overall affinity difference. (b): $IoU$ difference. (c): spatial occupancy difference. Bottom-Left: The calculation process of the distinct metrics in BEV space. Bottom-Right: The contributions of CGR for $A\text{-}gIoU$ and $gIoU$ across all trajectories within randomly sampled scenes from nuScenes, using our tracker with the CenterPoint detector (yin2021center).
  • Figure 4: The average timing statistics of each module in Fast-Poly on the nuScenes and Waymo val sets without parallelization. Invariant denotes the lightweight filter for time-invariant states. Motion denotes the Kalman Filter for motion states. Life denotes the lifecycle module. Score denotes the score refinement.
  • Figure 5: Comparison of accuracy under different values of the newly introduced hyperparameters. No category-specific technique is applied; all categories use the same parameter values.
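
The Figure 3 caption contrasts $A\text{-}gIoU$ with $gIoU$ in BEV space. To illustrate why alignment makes the affinity cheap, the following is a minimal sketch of a generalized IoU between two BEV boxes once they are treated as axis-aligned. It rests on assumptions not stated in this section: the alignment step is modeled simply as ignoring each box's yaw, height is omitted, and the names `BevBox` and `aligned_giou_bev` are hypothetical. It is not the authors' implementation, whose exact alignment rule is defined in the paper and the linked repository.

```python
from dataclasses import dataclass


@dataclass
class BevBox:
    """Hypothetical BEV box: centre (x, y), length along x, width along y, yaw in radians."""
    x: float
    y: float
    length: float
    width: float
    yaw: float = 0.0


def aligned_giou_bev(a: BevBox, b: BevBox) -> float:
    """gIoU of two BEV boxes after discarding yaw (assumed alignment step)."""
    if min(a.length, a.width, b.length, b.width) <= 0:
        raise ValueError("box dimensions must be positive")

    # Axis-aligned extents; yaw is deliberately ignored, so no polygon
    # clipping or convex-hull computation is needed.
    ax0, ax1 = a.x - a.length / 2, a.x + a.length / 2
    ay0, ay1 = a.y - a.width / 2, a.y + a.width / 2
    bx0, bx1 = b.x - b.length / 2, b.x + b.length / 2
    by0, by1 = b.y - b.width / 2, b.y + b.width / 2

    # Intersection rectangle (zero area if the boxes do not overlap).
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h

    union = a.length * a.width + b.length * b.width - inter

    # Smallest axis-aligned rectangle enclosing both boxes.
    hull = (max(ax1, bx1) - min(ax0, bx0)) * (max(ay1, by1) - min(ay0, by0))

    # Standard gIoU: IoU minus the fraction of the hull not covered by the union.
    return inter / union - (hull - union) / hull
```

With axis-aligned boxes, the intersection and the enclosing region reduce to a few `min`/`max` operations, which is the kind of saving the Alignment step targets; a rotated-box gIoU would instead need polygon intersection and a convex hull per pair.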