Easy-Poly: A Easy Polyhedral Framework For 3D Multi-Object Tracking

Peng Zhang, Xin Li, Xin Lin, Liang He

TL;DR

Easy-Poly addresses the gap between 3D detection and MOT by introducing a real-time tracking pipeline that tightly couples enhanced proposals with robust data association and motion modeling. The approach combines an Augmented Proposal Generator, a Dynamic Track-Oriented Data Association, and a Dynamic Motion Modeling module with confidence-weighted updates and adaptive noise covariances, plus life-cycle adjustments. On nuScenes, Easy-Poly outperforms strong baselines in both detection (mAP/NDS) and MOT (AMOTA) metrics while maintaining real-time speed, demonstrating improved robustness in crowded scenes, small objects, and adverse weather. This work offers practical enhancements for autonomous driving perception and highlights the value of joint optimization of detection and tracking components.

Abstract

Recent advancements in 3D multi-object tracking (3D MOT) have predominantly relied on tracking-by-detection pipelines. However, these approaches often neglect potential enhancements in 3D detection processes, leading to high false positives (FP), missed detections (FN), and identity switches (IDS), particularly in challenging scenarios such as crowded scenes, small-object configurations, and adverse weather conditions. Furthermore, limitations in data preprocessing, association mechanisms, motion modeling, and life-cycle management hinder overall tracking robustness. To address these issues, we present Easy-Poly, a real-time, filter-based 3D MOT framework for multiple object categories. Our contributions include: (1) An Augmented Proposal Generator utilizing multi-modal data augmentation and refined SpConv operations, significantly improving mAP and NDS on nuScenes; (2) A Dynamic Track-Oriented (DTO) data association algorithm that effectively manages uncertainties and occlusions through optimal assignment and multiple hypothesis handling; (3) A Dynamic Motion Modeling (DMM) module incorporating a confidence-weighted Kalman filter and adaptive noise covariances, enhancing MOTA and AMOTA in challenging conditions; and (4) An extended life-cycle management system with adjustable thresholds to reduce ID switches and false terminations. Experimental results show that Easy-Poly outperforms state-of-the-art methods such as Poly-MOT and Fast-Poly, achieving notable gains in mAP (e.g., from 63.30% to 64.96% with LargeKernel3D) and AMOTA (e.g., from 73.1% to 74.5%), while also running in real time. These findings highlight Easy-Poly's adaptability and robustness in diverse scenarios, making it a compelling choice for autonomous driving and related 3D MOT applications. The source code of this paper will be published upon acceptance.
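
To make the idea of a confidence-weighted Kalman update concrete, here is a minimal one-dimensional sketch. It is not the authors' DMM implementation: the class name `ScalarKalman`, the static motion model, the default noise values, and the rule of dividing the base measurement noise by the detection confidence are all assumptions chosen for illustration. It only shows the general effect that a low-confidence detection moves the track estimate less than a high-confidence one.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ScalarKalman:
    """1D Kalman filter over a single coordinate (illustrative, hypothetical)."""
    x: float             # state estimate
    p: float             # variance of the estimate
    q: float = 0.1       # process noise variance (assumed value)
    r_base: float = 1.0  # measurement noise variance at confidence 1.0 (assumed)

    def predict(self) -> None:
        # Static motion model: the state is kept, uncertainty grows by q.
        self.p += self.q

    def update(self, z: float, confidence: float) -> float:
        # Assumed weighting rule: inflate measurement noise for weak detections.
        conf = min(max(confidence, 1e-3), 1.0)
        r_eff = self.r_base / conf
        gain = self.p / (self.p + r_eff)
        self.x += gain * (z - self.x)
        self.p *= 1.0 - gain
        return gain


def demo() -> Tuple[float, float]:
    """Return the Kalman gains for a confident and an unconfident detection."""
    strong = ScalarKalman(x=0.0, p=1.0)
    weak = ScalarKalman(x=0.0, p=1.0)
    for kf in (strong, weak):
        kf.predict()
    g_strong = strong.update(z=1.0, confidence=0.9)
    g_weak = weak.update(z=1.0, confidence=0.2)
    return g_strong, g_weak
```

Calling `demo()` returns two gains, and the gain for the 0.9-confidence detection is larger than the gain for the 0.2-confidence one. A full 3D tracker would use a vector state and matrix covariances, and the paper's adaptive noise covariances may follow a different rule than the one assumed here.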

Paper Structure

This paper contains 18 sections, 5 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Effect of data augmentation on CenterPoint and LargeKernel3D in multi-modal mode, illustrating the performance gains of FocalsConv in 3D object detection. (a) and (b) present CenterPoint results, while (c) and (d) show LargeKernel3D results. (a) and (c) use the baseline FocalsConv model, whereas (b) and (d) use our augmented proposal generator, which reduces false negatives and false positives. The comparison also highlights the augmented proposal generator's advantage over the original detectors, particularly on small objects and in crowded scenes.
  • Figure 2: The pipeline of our Easy-Poly method. Real-time improvements over the baseline Fast-Poly (li2024fast) are highlighted in distinct colors. Pink denotes optimizations of Fast-Poly. Purple denotes the new functional modules.
  • Figure 3: Ablation study on the use of the Score Filter and Non-Maximum Suppression. Run-Time denotes the execution time of the Pre-processing Module. Poly-MOT is compared with our proposed Easy-Poly method.