A Multilevel Strategy to Improve People Tracking in a Real-World Scenario

Cristiano B. de Oliveira, Joao C. Neves, Rafael O. Ribeiro, David Menotti

TL;DR

The paper addresses robust people tracking in real-world surveillance footage, where occlusions and track fragmentation frequently break ID consistency. It introduces WindowTracker, a multilevel (two-stage) strategy that pairs a first-pass tracker (L1) with a high-confidence secondary pass (L2): L2 reselects the top detections within a sliding window, and IDs are reconciled through IoU-based assignment to correct the L1 associations. The study contributes the UFPR-Planalto801 dataset and an evaluation across twelve L1/L2 configurations, reporting IDF1 gains of up to 9.5 percentage points along with improvements in MOTA and HOTA, which indicates better ID retention in challenging real-world footage. The approach is a scalable, detector-agnostic framework for improving tracking reliability in security and forensics contexts; future work targets dataset expansion and deeper feature-based L2 trackers.
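
To make the IoU-based reconciliation step concrete, here is a minimal sketch of how the IDs from two tracking passes could be matched in a single frame. It is not the authors' implementation: the function names, the `(x1, y1, x2, y2)` box format, the greedy one-to-one matching, and the 0.5 threshold are all assumptions chosen for illustration, and the summary does not state which assignment procedure WindowTracker actually uses.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def reconcile_ids(
    l1: Dict[int, Box], l2: Dict[int, Box], min_iou: float = 0.5
) -> Dict[int, int]:
    """Map L1 IDs to L2 IDs by greedy one-to-one IoU matching (hypothetical helper)."""
    # Score every L1/L2 pair and visit them from highest to lowest overlap.
    pairs = sorted(
        ((iou(b1, b2), id1, id2) for id1, b1 in l1.items() for id2, b2 in l2.items()),
        reverse=True,
    )
    mapping: Dict[int, int] = {}
    used_l2 = set()
    for score, id1, id2 in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap too little
        if id1 in mapping or id2 in used_l2:
            continue  # each ID is matched at most once
        mapping[id1] = id2
        used_l2.add(id2)
    return mapping


# Example: L1 track 7 overlaps L2 track 1; L1 track 8 has no L2 counterpart.
l1_boxes = {7: (0.0, 0.0, 10.0, 10.0), 8: (20.0, 20.0, 30.0, 30.0)}
l2_boxes = {1: (1.0, 1.0, 11.0, 11.0), 2: (100.0, 100.0, 110.0, 110.0)}
id_map = reconcile_ids(l1_boxes, l2_boxes)
```

In this sketch an L1 ID that appears in `id_map` would be relabeled with its matched L2 ID, while unmatched L1 tracks keep their original ID. A globally optimal assignment (for example, the Hungarian algorithm) could replace the greedy loop without changing the interface.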

Abstract

The Palácio do Planalto, office of the President of Brazil, was invaded by protesters on January 8, 2023. Surveillance videos taken from inside the building were subsequently released by the Brazilian Supreme Court for public scrutiny. We used segments of this footage to create the UFPR-Planalto801 dataset for people tracking and re-identification in a real-world scenario. The dataset consists of more than 500,000 images. This paper presents a tracking approach targeting this dataset. The proposed method combines known state-of-the-art trackers in a multilevel hierarchy to correct ID association along the trajectories. We evaluated our method using the IDF1, MOTA, MOTP, and HOTA metrics. The results show improvements for every tracker used in the experiments, with the IDF1 score increasing by up to 9.5%.
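
For readers unfamiliar with the metrics named above, the block below gives the definitions commonly used in the multi-object tracking literature. They are not restated in this excerpt, so the notation is an assumption: FN, FP, and IDSW are per-frame false negatives, false positives, and identity switches; GT is the number of ground-truth objects; IDTP, IDFP, and IDFN are identity-level true positives, false positives, and false negatives; DetA and AssA are detection and association accuracy at localization threshold α.

```latex
\begin{align}
\mathrm{MOTA} &= 1 - \frac{\sum_t \left(\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t\right)}{\sum_t \mathrm{GT}_t} \\
\mathrm{IDF1} &= \frac{2\,\mathrm{IDTP}}{2\,\mathrm{IDTP} + \mathrm{IDFP} + \mathrm{IDFN}} \\
\mathrm{HOTA}_\alpha &= \sqrt{\mathrm{DetA}_\alpha \cdot \mathrm{AssA}_\alpha}
\end{align}
```

IDF1 depends only on how consistently identities are preserved along trajectories, which is why it is the metric most directly affected by an ID-correction strategy. The TL;DR describes the 9.5% figure as a percentage-point gain, that is, an absolute difference between two IDF1 scores expressed in percent.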

Paper Structure (9 sections, 4 equations, 4 figures, 7 tables)

Figures (4)

  • Figure 1: Examples of scenes captured on footage.
  • Figure 2: Framing of people in the proposed dataset: (a) occluded; (b) full body; (c) upper body; and (d) head only.
  • Figure 3: Overview of the proposed WindowTracker approach.
  • Figure 4: Example of ID correction using WindowTracker.