Strong-TransCenter: Improved Multi-Object Tracking based on Transformers with Dense Representations
Amit Galor, Roy Orfaig, Ben-Zion Bobrovsky
TL;DR
This work tackles the bottleneck of motion estimation in Transformer-based multi-object tracking by augmenting TransCenter with a tuned Kalman filter and an appearance embedding network. A heatmap-derived noise model sets the Kalman measurement noise, while a FastReID-based embedding supports re-identification within a three-stage cascade association. On MOT17 and MOT20, the resulting tracker, Strong-TransCenter (STC), achieves higher HOTA and IDF1 than other Transformer-based trackers, with competitive MOTA and a modest runtime cost. The findings suggest that targeted post-processing and re-ID integration can substantially improve tracker robustness, potentially guiding future all-in-one Transformer MOT designs.
Abstract
Transformer networks have been a focus of research in many fields in recent years, surpassing state-of-the-art performance in a range of computer vision tasks. However, in the task of Multiple Object Tracking (MOT), the power of Transformers remains relatively unexplored. Among the pioneering efforts in this domain, TransCenter, a Transformer-based MOT architecture with dense object queries, demonstrated exceptional tracking capabilities while maintaining reasonable runtime. Nonetheless, one critical aspect of MOT, track displacement estimation, leaves room for improvement that would further reduce association errors. In response to this challenge, our paper introduces a novel improvement to TransCenter. We propose a post-processing mechanism grounded in the Track-by-Detection paradigm that refines the track displacement estimation. Our approach integrates a carefully designed Kalman filter, which incorporates Transformer outputs into its measurement error estimation, and an embedding network for target re-identification. This combined strategy yields a substantial improvement in the accuracy and robustness of the tracking process. We validate our contributions through comprehensive experiments on the MOTChallenge datasets MOT17 and MOT20, where our proposed approach outperforms other Transformer-based trackers. The code is publicly available at: https://github.com/amitgalor18/STC_Tracker
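To illustrate the idea of feeding Transformer outputs into the Kalman measurement error, the following is a minimal sketch, not the implementation in the repository above. It assumes a constant-velocity model over the object center and a hypothetical linear mapping from a detection's heatmap confidence to the measurement standard deviation; the names `measurement_noise` and `kalman_step` and the parameters `base_std`, `min_std`, and `q` are illustrative assumptions, and the mapping actually used by STC may differ.

```python
import numpy as np


def measurement_noise(score, base_std=1.0, min_std=0.05):
    """Map a heatmap confidence in [0, 1] to a 2x2 measurement covariance R.

    Assumption for illustration only: the standard deviation shrinks
    linearly as the confidence grows, and is clipped from below.
    """
    std = max(min_std, base_std * (1.0 - score))
    return (std ** 2) * np.eye(2)


def kalman_step(x, P, z, score, dt=1.0, q=1e-2):
    """One predict + update step for the state [cx, cy, vx, vy]."""
    # Constant-velocity transition: position advances by velocity * dt.
    F = np.eye(4)
    F[0, 2] = F[1, 3] = dt
    # Only the center position is observed.
    H = np.zeros((2, 4))
    H[0, 0] = H[1, 1] = 1.0
    Q = q * np.eye(4)  # process noise (illustrative value)

    # Predict.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q

    # Update, with R depending on the detection confidence.
    R = measurement_noise(score)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new


# Usage: the same measurement is trusted more when the confidence is high.
x0, P0 = np.zeros(4), np.eye(4)
z = np.array([1.0, 1.0])
x_hi, _ = kalman_step(x0, P0, z, score=0.95)
x_lo, _ = kalman_step(x0, P0, z, score=0.20)
```

With this design, a confident detection pulls the track state almost all the way to the measurement, while a low-confidence detection leaves more weight on the motion prediction.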
