Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey

Gaoang Wang; Mingli Song; Jenq-Neng Hwang

TL;DR

This survey conducts a comprehensive overview with in-depth analysis of embedding methods in MOT from seven different perspectives: patch-level embedding, single-frame embedding, cross-frame joint embedding, correlation embedding, sequential embedding, tracklet embedding, and cross-track relational embedding.

Abstract

Multi-object tracking (MOT) aims to associate target objects across video frames in order to obtain entire moving trajectories. With the advancement of deep neural networks and the increasing demand for intelligent video analysis, MOT has gained significantly increased interest in the computer vision community. Embedding methods play an essential role in object location estimation and temporal identity association in MOT. Unlike other computer vision tasks, such as image classification, object detection, re-identification, and segmentation, embedding methods in MOT have large variations, and they have never been systematically analyzed and summarized. In this survey, we first conduct a comprehensive overview with in-depth analysis for embedding methods in MOT from seven different perspectives, including patch-level embedding, single-frame embedding, cross-frame joint embedding, correlation embedding, sequential embedding, tracklet embedding, and cross-track relational embedding. We further summarize the existing widely used MOT datasets and analyze the advantages of existing state-of-the-art methods according to their embedding strategies. Finally, some critical yet under-investigated areas and future research directions are discussed.

Paper Structure (37 sections, 12 equations, 7 figures, 4 tables)

Introduction
MOT Related Tasks
Single Object Tracking
Video Object Detection
Re-Identification
A Taxonomy of MOT Embedding Methods
Patch-Level Box Image Embedding
Self-Embedding
Pairwise Embedding
Single-Frame Detection Embedding
Cross-Frame Joint Embedding
Multi-Frame Spatial-Temporal Embedding
Head-Level Feature Aggregated Embedding
Correlation-Based Embedding
Sequential Embedding
...and 22 more sections

Figures (7)

Figure 1: A taxonomy of MOT embedding methods. The green and red boxes show the categories and representative references, respectively.
Figure 2: Pairwise patch-level embedding. $\bigoplus$ denotes concatenation.
Figure 3: Single-frame joint detection embedding.
Figure 4: Multi-frame spatial-temporal embedding.
Figure 5: Head-level aggregated embedding.
...and 2 more figures
