MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving
Xiyang Wang, Shouzheng Qi, Jieyou Zhao, Hangning Zhou, Siyu Zhang, Guoan Wang, Kai Tu, Songlin Guo, Jianbo Zhao, Jian Li, Mu Yang
TL;DR
MCTrack tackles the lack of generalizability in 3D MOT by introducing a unified TBM framework that operates on a standardized BaseVersion format across KITTI, nuScenes, and Waymo. The core innovations are a decoupled Kalman-filter design for position, size, and heading, and Ro_GDIoU-based two-stage matching that combines BEV and RV perspectives to robustly associate trajectories. The paper also proposes motion-centric evaluation metrics (e.g., VAE, VNE, VDE) to quantify downstream-relevant motion outputs like velocity and acceleration. Empirically, MCTrack achieves SOTA performance on multiple datasets and demonstrates that Ro_GDIoU and secondary RV matching improve robustness, while BaseVersion reduces cross-dataset preprocessing burdens for researchers and practitioners.
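To make the idea of a decoupled Kalman-filter design more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a constant-velocity model for BEV position, a constant model for size, and a yaw/yaw-rate model for heading, each run as an independent filter. The names `LinearKF`, `make_decoupled_filters`, `wrap_angle`, and all noise values are hypothetical choices made for illustration.

```python
import numpy as np

class LinearKF:
    """Generic linear Kalman filter over a small state vector (illustrative)."""
    def __init__(self, F, H, Q, R, x0, P0):
        self.F, self.H, self.Q, self.R = F, H, Q, R
        self.x, self.P = np.asarray(x0, dtype=float), np.asarray(P0, dtype=float)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x

    def update(self, z, residual=np.subtract):
        y = residual(np.asarray(z, dtype=float), self.H @ self.x)  # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)                   # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(len(self.x)) - K @ self.H) @ self.P
        return self.x

def wrap_angle(a):
    """Wrap an angle (radians) into [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi

def angle_residual(z, pred):
    """Innovation for heading that respects the +/-pi wrap-around."""
    return wrap_angle(z - pred)

def make_decoupled_filters(xy, lw, yaw, dt=0.1):
    """Build three independent filters; all noise values are assumed, not tuned."""
    # Position: state [x, y, vx, vy], measurement [x, y] (constant velocity).
    F_pos = np.eye(4)
    F_pos[0, 2] = F_pos[1, 3] = dt
    pos = LinearKF(F_pos, np.eye(2, 4), 0.01 * np.eye(4), 0.1 * np.eye(2),
                   [xy[0], xy[1], 0.0, 0.0], np.eye(4))
    # Size: state [length, width], assumed constant over time.
    size = LinearKF(np.eye(2), np.eye(2), 1e-4 * np.eye(2), 0.1 * np.eye(2),
                    lw, np.eye(2))
    # Heading: state [yaw, yaw_rate], measurement [yaw].
    F_yaw = np.array([[1.0, dt], [0.0, 1.0]])
    head = LinearKF(F_yaw, np.array([[1.0, 0.0]]), 0.01 * np.eye(2),
                    0.1 * np.eye(1), [yaw, 0.0], np.eye(2))
    return pos, size, head

# One tracking step: predict all three filters, then update each with its own
# slice of the detection. Heading uses the wrapped residual.
pos, size, head = make_decoupled_filters(xy=(0.0, 0.0), lw=(4.0, 2.0), yaw=3.1)
for f in (pos, size, head):
    f.predict()
pos.update([1.0, 0.0])
size.update([4.2, 2.0])
head.update([-3.1], residual=angle_residual)
```

Because the three state groups do not share a covariance matrix in this sketch, noisy size or heading measurements cannot leak into the position and velocity estimates, which is one plausible motivation for decoupling.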
Abstract
This paper introduces MCTrack, a new 3D multi-object tracking method that achieves state-of-the-art (SOTA) performance across the KITTI, nuScenes, and Waymo datasets. Existing tracking paradigms often perform well on specific datasets but lack generalizability; MCTrack addresses this gap with a unified solution. Additionally, we standardize the format of perception results across datasets into a format termed BaseVersion, which lets researchers in multi-object tracking (MOT) concentrate on core algorithmic development without the undue burden of data preprocessing. Finally, recognizing the limitations of current evaluation metrics, we propose a novel set of metrics that assesses motion information output, such as velocity and acceleration, which is crucial for downstream tasks. The source code of the proposed method is available at https://github.com/megvii-research/MCTrack
