Delving into the Trajectory Long-tail Distribution for Muti-object Tracking

Sijia Chen, En Yu, Jinyang Li, Wenbing Tao

TL;DR

This work identifies a pronounced long-tail distribution in MOT trajectory lengths that biases tail-class learning. It introduces two viewpoint-aware data augmentations, Stationary Camera View Data Augmentation (SVA) and Dynamic Camera View Data Augmentation (DVA), along with a Group Softmax (GS) module to balance Re-ID across pedestrian classes with differing sample counts. The method, validated on MOT15/16/17/20 with two SOTA trackers, yields consistent improvements in MOTA, IDF1, and HOTA and demonstrates data-efficiency benefits when training data is limited. By addressing both information augmentation and classifier balance, the approach enhances tail-class recognition and robust trajectory association, particularly in crowded scenes where long tails are most severe.

Abstract

Multiple Object Tracking (MOT) is a critical area within computer vision, with a broad spectrum of practical applications. Current research has primarily focused on developing tracking algorithms and enhancing post-processing techniques. Yet the nature of the tracking data itself has not been thoroughly examined. In this study, we pioneer an exploration into the distribution patterns of tracking data and identify a pronounced long-tail distribution issue within existing MOT datasets. We note a significant imbalance in the distribution of trajectory lengths across different pedestrians, a phenomenon we refer to as "pedestrians trajectory long-tail distribution". To address this challenge, we introduce a tailored strategy designed to mitigate the effects of this skewed distribution. Specifically, we propose two data augmentation strategies designed for viewpoint states, Stationary Camera View Data Augmentation (SVA) and Dynamic Camera View Data Augmentation (DVA), and the Group Softmax (GS) module for Re-ID. SVA backtracks and predicts the pedestrian trajectories of tail classes, and DVA uses a diffusion model to change the background of the scene. GS divides the pedestrians into unrelated groups and performs a softmax operation on each group individually. Our proposed strategies can be integrated into numerous existing tracking systems, and extensive experiments validate the efficacy of our method in reducing the influence of the long-tail distribution on multi-object tracking performance. The code is available at https://github.com/chen-si-jia/Trajectory-Long-tail-Distribution-for-MOT.
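
The Group Softmax idea in the abstract (split identity classes into groups, then apply softmax within each group separately) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the `group_of_class` mapping, and the example grouping are assumptions made for illustration, and the abstract does not state how groups are formed.

```python
import math
from typing import Dict, List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty sequence."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def group_softmax(
    logits: Sequence[float], group_of_class: Sequence[int]
) -> List[float]:
    """Apply softmax separately within each group of identity classes.

    logits[i] is the Re-ID score for identity class i, and
    group_of_class[i] is the group index of that class. The grouping
    itself is a hypothetical input; how to build it is not shown here.
    """
    if len(logits) != len(group_of_class):
        raise ValueError("logits and group_of_class must have equal length")

    # Collect the class indices that belong to each group.
    members: Dict[int, List[int]] = {}
    for class_idx, group_idx in enumerate(group_of_class):
        members.setdefault(group_idx, []).append(class_idx)

    # Normalise each group independently, so probabilities sum to 1 per group.
    probs = [0.0] * len(logits)
    for class_indices in members.values():
        group_probs = softmax([logits[i] for i in class_indices])
        for i, p in zip(class_indices, group_probs):
            probs[i] = p
    return probs


# Illustrative values only: classes 0-1 form one group, classes 2-4 another.
example_logits = [2.0, 1.0, 0.5, 0.5, 0.5]
example_groups = [0, 0, 1, 1, 1]
example_probs = group_softmax(example_logits, example_groups)
```

Because each group is normalised on its own, a class with few samples competes only against the other classes in its group rather than against every identity at once. That is the intuition behind using per-group softmax to balance Re-ID classes with differing sample counts.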

Paper Structure (21 sections, 6 equations, 15 figures, 11 tables)

Figures (15)

  • Figure 1: The number of frames of pedestrians with different identities in the MOTChallenge datasets. We stipulate that different pedestrian identities are regarded as different pedestrian classes.
  • Figure 2: Overall pipeline of our strategies. Our strategies comprise three modules: (1) SVA backtracks and predicts the pedestrian trajectories of tail classes. (2) DVA uses a diffusion model to change the background of the scene. (3) GS divides the pedestrians with different identities into unrelated groups and performs a softmax operation on each group individually.
  • Figure 3: Illustration of the Stationary Camera View Data Augmentation (SVA).
  • Figure 4: Illustration of the Dynamic Camera View Data Augmentation (DVA).
  • Figure 5: Division of head classes and tail classes is based on the class average principle on the MOT17 validation set.
  • ...and 10 more figures
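
The captions for Figures 1 and 5 describe counting frames per pedestrian identity and dividing identities into head and tail classes by a class average. The sketch below shows one plausible reading, under two assumptions that the captions do not confirm: annotations arrive as `(frame, identity)` pairs, and "class average principle" means comparing each identity's frame count with the mean over all identities. All names and the toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def frames_per_identity(annotations: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Count the distinct frames in which each identity appears."""
    unique_rows = set(annotations)  # ignore duplicated (frame, identity) rows
    return dict(Counter(identity for _, identity in unique_rows))


def split_head_tail(counts: Dict[int, int]) -> Tuple[List[int], List[int]]:
    """Split identities around the mean frame count (assumed rule).

    Identities with more frames than the mean are treated as head classes;
    all others are treated as tail classes.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    mean_frames = sum(counts.values()) / len(counts)
    head = sorted(i for i, n in counts.items() if n > mean_frames)
    tail = sorted(i for i, n in counts.items() if n <= mean_frames)
    return head, tail


# Toy data: identity 1 is visible for 10 frames, identity 2 for 2, identity 3 for 1.
toy_annotations = [(f, 1) for f in range(1, 11)] + [(1, 2), (2, 2)] + [(5, 3)]
toy_counts = frames_per_identity(toy_annotations)
toy_head, toy_tail = split_head_tail(toy_counts)
```

On the toy data the mean is 13/3 frames per identity, so only identity 1 is a head class. A few long trajectories raise the mean and push most identities into the tail, which is the imbalance the paper sets out to address.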