History-Aware Transformation of ReID Features for Multiple Object Tracking

Ruopeng Gao; Yuyao Wang; Chunxu Liu; Limin Wang

TL;DR

This work tackles multi-object tracking by arguing that generic ReID features are suboptimal for distinguishing similar targets within a single video sequence. It introduces a training-free History-Aware Projection using Fisher Linear Discriminant to compute a per-sequence projection $W$, transforming features via $\boldsymbol{f}'=\boldsymbol{f}W$ and selecting a reduced dimension $D'$. It further enhances robustness with a Temporal-Shifted Trajectory Centroid that emphasizes recent observations, and combines the transformed and original feature spaces through Knowledge Integration using $\cos(i,j)=\alpha\cos(\boldsymbol{f}_i',\hat{\boldsymbol{f}}_j')+(1-\alpha)\cos(\boldsymbol{f}_i,\hat{\boldsymbol{f}}_j)$. Extensive experiments on DanceTrack, MOT17/16, SportsMOT, and TAO show significant performance gains, including strong zero-shot transfer, demonstrating the value of sequence-tailored ReID representations for MOT.
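The Knowledge Integration score above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the default `alpha=0.5`, and the choice to obtain the transformed centroid by projecting the original centroid with the same $W$ are all assumptions made for the example.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def project(f: Sequence[float], W: Sequence[Sequence[float]]) -> List[float]:
    """Row-vector projection f' = f W, with W of shape (D, D')."""
    d_out = len(W[0])
    return [sum(f[d] * W[d][k] for d in range(len(f))) for k in range(d_out)]


def fused_similarity(
    f_det: Sequence[float],      # detection feature f_i (length D)
    f_traj: Sequence[float],     # trajectory centroid \hat{f}_j (length D)
    W: Sequence[Sequence[float]],  # per-sequence projection, shape (D, D')
    alpha: float = 0.5,          # hypothetical default, not taken from the paper
) -> float:
    """alpha * cos(f_i', \\hat{f}_j') + (1 - alpha) * cos(f_i, \\hat{f}_j)."""
    # Assumption: the transformed centroid is the projection of the centroid.
    sim_projected = cosine(project(f_det, W), project(f_traj, W))
    sim_original = cosine(f_det, f_traj)
    return alpha * sim_projected + (1.0 - alpha) * sim_original
```

With `alpha=1.0` only the transformed space is used, and with `alpha=0.0` the score reduces to the ordinary ReID cosine similarity, so `alpha` controls how much the tracker trusts the sequence-specific projection.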

Abstract

The aim of multiple object tracking (MOT) is to detect all objects in a video and link them into multiple trajectories. Generally, this process is carried out in two steps: detecting objects and associating them across frames based on various cues and metrics. Many studies and applications use object appearance, also known as re-identification (ReID) features, for target matching through straightforward similarity calculation. However, we argue that this practice is overly naive and overlooks the unique characteristics of MOT tasks. Unlike regular re-identification tasks, which strive to distinguish all potential targets in a general representation, multiple object tracking typically focuses on differentiating similar targets within the same video sequence. Therefore, we believe that seeking a more suitable feature representation space based on the sample distribution of each sequence will enhance tracking performance. In this paper, we propose using history-aware transformations on ReID features to achieve more discriminative appearance representations. Specifically, we treat historical trajectory features as conditions and employ a tailored Fisher Linear Discriminant (FLD) to find a spatial projection matrix that maximizes the separation between different trajectories. Our extensive experiments reveal that this training-free projection can significantly boost feature-only trackers, achieving tracking performance that is competitive with, and in some cases superior to, state-of-the-art methods, while also demonstrating impressive zero-shot transfer capabilities. This demonstrates the effectiveness of our proposal and encourages future investigation into the importance and customization of ReID models in multiple object tracking. The code will be released at https://github.com/HELLORPG/HATReID-MOT.
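For reference, the textbook multi-class Fisher criterion can be written as follows. This is a sketch of the standard formulation, not necessarily the tailored variant used in the paper. The symbols $\mathcal{T}_k$ (historical features of trajectory $k$), $n_k$, $\boldsymbol{\mu}_k$, $\boldsymbol{\mu}$, $S_W$, and $S_B$ are notation assumed here, with features treated as row vectors of dimension $D$.

$$
\boldsymbol{\mu}_k=\frac{1}{n_k}\sum_{\boldsymbol{f}\in\mathcal{T}_k}\boldsymbol{f},\qquad
\boldsymbol{\mu}=\frac{1}{\sum_k n_k}\sum_{k}\sum_{\boldsymbol{f}\in\mathcal{T}_k}\boldsymbol{f}
$$

$$
S_W=\sum_{k}\sum_{\boldsymbol{f}\in\mathcal{T}_k}(\boldsymbol{f}-\boldsymbol{\mu}_k)^{\top}(\boldsymbol{f}-\boldsymbol{\mu}_k),\qquad
S_B=\sum_{k}n_k\,(\boldsymbol{\mu}_k-\boldsymbol{\mu})^{\top}(\boldsymbol{\mu}_k-\boldsymbol{\mu})
$$

$$
W=\arg\max_{W\in\mathbb{R}^{D\times D'}}\operatorname{tr}\!\left(\left(W^{\top}S_W W\right)^{-1}W^{\top}S_B W\right)
$$

In the classical solution, the columns of $W$ are the eigenvectors of $S_W^{-1}S_B$ with the $D'$ largest eigenvalues. Since $S_B$ has rank at most $K-1$ for $K$ trajectories, at most $K-1$ of these directions carry between-trajectory information in this standard form.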
