UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation
Kefu Yi, Kai Luo, Xiaolei Luo, Jiangui Huang, Hao Wu, Rongdong Hu, Wei Hao
TL;DR
This paper addresses multi-object tracking under significant camera motion with UCMCTrack, a pure motion-model tracker that operates on the ground plane. It replaces frame-by-frame camera motion compensation with a single set of compensation parameters applied across the whole video sequence, runs a Kalman filter on ground-plane states, and performs data association with the Mapped Mahalanobis Distance (MMD), which explicitly models the uncertainty of projecting image detections onto the ground plane. Using correlated ground-plane measurements and a process noise compensation scheme, UCMCTrack achieves state-of-the-art results on MOT17, MOT20, DanceTrack, and KITTI while remaining real-time (more than 1000 FPS on CPU). The work highlights the benefit of modeling motion on the ground plane and suggests integrating conventional association cues such as IoU and ReID as future work to further improve robustness and generalization.
Abstract
Multi-object tracking (MOT) in video sequences remains a challenging task, especially in scenarios with significant camera movements. This is because targets can drift considerably on the image plane, leading to erroneous tracking outcomes. Addressing such challenges typically requires supplementary appearance cues or Camera Motion Compensation (CMC). While these strategies are effective, they also introduce a considerable computational burden, posing challenges for real-time MOT. In response to this, we introduce UCMCTrack, a novel motion model-based tracker robust to camera movements. Unlike conventional CMC that computes compensation parameters frame-by-frame, UCMCTrack consistently applies the same compensation parameters throughout a video sequence. It employs a Kalman filter on the ground plane and introduces the Mapped Mahalanobis Distance (MMD) as an alternative to the traditional Intersection over Union (IoU) distance measure. By leveraging projected probability distributions on the ground plane, our approach efficiently captures motion patterns and adeptly manages uncertainties introduced by homography projections. Remarkably, UCMCTrack, relying solely on motion cues, achieves state-of-the-art performance across a variety of challenging datasets, including MOT17, MOT20, DanceTrack and KITTI. More details and code are available at https://github.com/corfyi/UCMCTrack
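The idea of measuring distance between projected probability distributions on the ground plane can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a known 3x3 image-to-ground homography `A`, independent pixel noise `sigma_uv` on the detection's reference point, and a normalized Mahalanobis form with a log-determinant term; the abstract does not give the exact formula, and all function and parameter names here are hypothetical.

```python
import math

def project_to_ground(A, u, v):
    """Map image point (u, v) to ground-plane (x, y) with homography A (assumed image -> ground).

    Also returns the 2x2 Jacobian of (x, y) with respect to (u, v), which is
    used to carry pixel noise over to the ground plane.
    """
    bx = A[0][0] * u + A[0][1] * v + A[0][2]
    by = A[1][0] * u + A[1][1] * v + A[1][2]
    g = A[2][0] * u + A[2][1] * v + A[2][2]
    x, y = bx / g, by / g
    C = [[(A[0][0] - x * A[2][0]) / g, (A[0][1] - x * A[2][1]) / g],
         [(A[1][0] - y * A[2][0]) / g, (A[1][1] - y * A[2][1]) / g]]
    return (x, y), C

def mapped_mahalanobis(A, det_uv, sigma_uv, pred_xy, P_xy):
    """Distance between a detection and a track prediction on the ground plane.

    det_uv   : detection reference point in pixels (e.g. bottom centre of the box)
    sigma_uv : assumed pixel standard deviations (sigma_u, sigma_v)
    pred_xy  : Kalman-predicted ground-plane position of the track
    P_xy     : 2x2 predicted position covariance of the track
    """
    (x, y), C = project_to_ground(A, *det_uv)
    su2, sv2 = sigma_uv[0] ** 2, sigma_uv[1] ** 2
    # Mapped measurement noise R = C * diag(su2, sv2) * C^T (generally correlated)
    R = [[C[i][0] * C[j][0] * su2 + C[i][1] * C[j][1] * sv2 for j in range(2)]
         for i in range(2)]
    # Innovation covariance S = P_xy + R
    S = [[P_xy[i][j] + R[i][j] for j in range(2)] for i in range(2)]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    ex, ey = x - pred_xy[0], y - pred_xy[1]
    # e^T S^{-1} e, using the closed-form inverse of a 2x2 matrix
    quad = (S[1][1] * ex * ex - (S[0][1] + S[1][0]) * ex * ey + S[0][0] * ey * ey) / det
    # The log-determinant term (an assumption of this sketch) penalises very uncertain matches
    return quad + math.log(det)
```

In a tracker, a distance of this kind would fill the cost matrix between detections and tracks in place of an IoU cost before assignment. Because the Jacobian depends on where the detection lies in the image, the same pixel noise yields a different ground-plane uncertainty for each detection.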
