A Hybrid Approach for Visual Multi-Object Tracking

Toan Van Nguyen, Rasmus G. K. Christiansen, Dirk Kraft, Leon Bodenhagen

TL;DR

The paper tackles visual multi-object tracking under nonlinear dynamics and unknown, time-varying target counts by proposing a hybrid framework that merges stochastic particle filtering with deterministic data association. It employs PSO-guided particle refinement with a fitness function combining history, exploration, and social cues, and uses a generalized cost matrix within Hungarian matching to robustly associate tracks to detections. A velocity regression scheme estimates trend-based velocities from history to stabilize state updates, particularly during occlusions, and weak tracks are updated using neighbors and PSO-based estimates to preserve identities. Evaluations on MOT17-04 demonstrate superior performance over state-of-the-art trackers, achieving robust identity maintenance with real-time feasibility on CPU, and the authors provide open-source reference implementations for reproducibility.
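The association step summarized above can be sketched as follows. This is a minimal illustration, not the authors' implementation. The cost formula (one minus IoU, plus a low-confidence term and a track-penalty term), the weights `w_conf` and `w_pen`, the gate `max_cost`, and all function names are assumptions made for this example. Each track is also reduced to a single box rather than a particle set. To stay dependency-free, the sketch finds the minimum-cost one-to-one assignment by exhaustive search, which reaches the same optimum as Hungarian matching on small inputs; a practical tracker would use a polynomial-time solver such as `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cost_matrix(track_boxes, penalties, det_boxes, det_confs,
                w_conf=0.5, w_pen=0.1):
    """Rows are tracks, columns are detections.

    ASSUMED cost (not taken from the paper):
        (1 - IoU) + w_conf * (1 - confidence) + w_pen * track_penalty
    """
    return [[(1.0 - iou(t, d)) + w_conf * (1.0 - c) + w_pen * p
             for d, c in zip(det_boxes, det_confs)]
            for t, p in zip(track_boxes, penalties)]


def associate(cost, max_cost=1.0):
    """Minimum-cost one-to-one assignment by exhaustive search.

    Returns (track_index, detection_index) pairs whose cost passes the
    hypothetical gate `max_cost`; gated-out tracks stay unmatched.
    """
    if not cost or not cost[0]:
        return []
    n_tracks, n_dets = len(cost), len(cost[0])
    if n_tracks > n_dets:
        raise ValueError("this sketch assumes no more tracks than detections")
    best = min(permutations(range(n_dets), n_tracks),
               key=lambda perm: sum(cost[i][j] for i, j in enumerate(perm)))
    return [(i, j) for i, j in enumerate(best) if cost[i][j] <= max_cost]


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
cost = cost_matrix(tracks, [0.0, 0.0], dets, [0.9, 0.8, 0.9])
print(associate(cost))  # -> [(0, 1), (1, 0)]
```

The third detection overlaps no track, so it is left unassigned and could seed a new track. A track whose best candidate exceeds the gate is likewise left unmatched.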

Abstract

This paper proposes a visual multi-object tracking method that jointly employs stochastic and deterministic mechanisms to ensure identifier consistency for unknown and time-varying target numbers under nonlinear dynamics. A stochastic particle filter addresses nonlinear dynamics and non-Gaussian noise, with support from particle swarm optimization (PSO) to guide particles toward state distribution modes and mitigate divergence through proposed fitness measures incorporating motion consistency, appearance similarity, and social-interaction cues with neighboring targets. Deterministic association further enforces identifier consistency via a proposed cost matrix incorporating spatial consistency between particles and current detections, detection confidences, and track penalties. Subsequently, a novel scheme is proposed for the smooth updating of target states while preserving their identities, particularly for weak tracks during interactions with other targets and prolonged occlusions. Moreover, velocity regression over past states provides trend-seed velocities, enhancing particle sampling and state updates. The proposed tracker is designed to operate flexibly on both pre-recorded videos and live camera streams, where future frames are unavailable. Experimental results confirm superior performance compared to state-of-the-art trackers. Source-code reference implementations of both the proposed method and the compared trackers are provided on GitHub: https://github.com/SDU-VelKoTek/GenTrack2
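As a rough illustration of the velocity regression mentioned above, the sketch below fits a straight line to a track's recent positions by ordinary least squares and uses the slope as the velocity. The abstract does not specify the regression model, so the linear fit, the `(frame, x, y)` history layout, and the names `regress_velocity` and `extrapolate` are assumptions for illustration only.

```python
def regress_velocity(history):
    """Least-squares slope of position over frame index.

    history: list of (frame_index, x, y) tuples for one track.
    Returns (vx, vy) in pixels per frame; (0.0, 0.0) when the history
    is too short or all samples share the same frame index.
    """
    n = len(history)
    if n < 2:
        return (0.0, 0.0)
    t_mean = sum(t for t, _, _ in history) / n
    x_mean = sum(x for _, x, _ in history) / n
    y_mean = sum(y for _, _, y in history) / n
    denom = sum((t - t_mean) ** 2 for t, _, _ in history)
    if denom == 0:
        return (0.0, 0.0)
    vx = sum((t - t_mean) * (x - x_mean) for t, x, _ in history) / denom
    vy = sum((t - t_mean) * (y - y_mean) for t, _, y in history) / denom
    return (vx, vy)


def extrapolate(last_xy, velocity, frames_ahead):
    """Constant-velocity prediction, e.g. while a target is occluded."""
    return (last_xy[0] + velocity[0] * frames_ahead,
            last_xy[1] + velocity[1] * frames_ahead)


history = [(0, 0.0, 10.0), (1, 2.0, 9.0), (2, 4.0, 8.0), (3, 6.0, 7.0)]
v = regress_velocity(history)               # slope of x and y over frames
print(v, extrapolate((6.0, 7.0), v, 5))
```

Fitting over several past states rather than differencing the last two damps frame-to-frame detection jitter, which is one plausible reason a trend-based velocity helps when a target goes unobserved for many frames.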

Paper Structure

This paper contains 11 sections, 16 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: Overview of the proposed method pipeline. The tracking system takes the current image, detections, and associated confidences as inputs, while previous target IDs, states, velocities, track penalties, and ages are updated internally.
  • Figure 2: An illustration of visual human tracking, with particle visualization.
  • Figure 3: ATA-IDF1-HOTA comparisons of trackers on human tracking.
  • Figure 4: Examples of target interactions during multi-object tracking. Green bounding boxes denote strong tracks, red boxes indicate weak tracks, and bold boxes mark observed targets. The first row shows target ID 14 from frame 215 to 275, with occlusion between frames 231 and 249. The second row shows the interaction between target ID 34 and its neighbours from frame 1 to 135. The last row shows the interaction of target ID 16 from frame 1 to 110, with occlusion between frames 20 and 95. Target IDs 14 and 34 cross other targets, while target ID 16 moves alongside a neighbour. Target states are updated smoothly, and their IDs are maintained consistently throughout the interactions, including during prolonged occlusions.