Omnidirectional Multi-Object Tracking

Kai Luo, Hao Shi, Sheng Wu, Fei Teng, Mengfei Duan, Chang Huang, Yuhang Wang, Kaiwei Wang, Kailun Yang

TL;DR

OmniTrack is a panoramic multi-object tracking framework that bridges the Tracking-By-Detection and End-To-End paradigms through a feedback-driven architecture. Its central components (Tracklet Management, FlexiTrack Instances, and the CircularStatE Module) mitigate panoramic distortions, leverage temporal context, and reduce uncertainty in 360° scenes. The QuadTrack dataset offers a challenging 360° FoV benchmark captured on a quadruped robot, enabling evaluation under dynamic, non-linear sensor motion. Experiments on JRDB and QuadTrack demonstrate state-of-the-art performance on the HOTA and IDF1 metrics, validating OmniTrack's effectiveness for omnidirectional perception in robotics. The code and dataset are released to support panoramic MOT research and practice.

Abstract

Panoramic imagery, with its 360° field of view, offers comprehensive information to support Multi-Object Tracking (MOT) in capturing spatial and temporal relationships of surrounding objects. However, most MOT algorithms are tailored for pinhole images with limited views, impairing their effectiveness in panoramic settings. Additionally, panoramic image distortions, such as resolution loss, geometric deformation, and uneven lighting, hinder direct adaptation of existing MOT methods, leading to significant performance degradation. To address these challenges, we propose OmniTrack, an omnidirectional MOT framework that incorporates Tracklet Management to introduce temporal cues, FlexiTrack Instances for object localization and association, and the CircularStatE Module to alleviate image and geometric distortions. This integration enables tracking in panoramic field-of-view scenarios, even under rapid sensor motion. To mitigate the lack of panoramic MOT datasets, we introduce the QuadTrack dataset--a comprehensive panoramic dataset collected by a quadruped robot, featuring diverse challenges such as panoramic fields of view, intense motion, and complex environments. Extensive experiments on the public JRDB dataset and the newly introduced QuadTrack benchmark demonstrate the state-of-the-art performance of the proposed framework. OmniTrack achieves a HOTA score of 26.92% on JRDB, representing an improvement of 3.43%, and further achieves 23.45% on QuadTrack, surpassing the baseline by 6.81%. The established dataset and source code are available at https://github.com/xifen523/OmniTrack.

Paper Structure

This paper contains 41 sections, 11 equations, 10 figures, 12 tables, 1 algorithm.

Figures (10)

  • Figure 1: The proposed OmniTrack pipeline. CSEM refers to the CircularStatE Module, DA stands for data association, E2E denotes the End-to-End tracking paradigm, TBD refers to the Track-By-Detection tracking paradigm, Upd refers to updating tracks, Init to initializing tracks, and Del to deleting tracks.
  • Figure 2: The proposed CircularStatE Module fuses multi-scale features to generate learnable instances. The DynamicSSM Block mitigates distortions in panoramic-FoV images, enhancing feature stability across uneven lighting and color distributions.
  • Figure 3: (a) shows the bounding box (bbox) size distribution for the training and validation sets, whereas (b) depicts the data collection platform and panoramic camera setup. (c) and (d) compare the normalized Y-axis pixel positions of trajectories between the QuadTrack and JRDB [martin2021jrdb] datasets, illustrating the significant vertical motion of the sensor in QuadTrack.
  • Figure 4: Effects of the trajectory initialization threshold and update threshold on the HOTA metric in OmniTrack$_{E2E}$.
  • Figure 5: Comparison of state-of-the-art methods on different datasets. Pinhole refers to Multi-Object Tracking (MOT) datasets that utilize pinhole camera images, whereas Panorama refers to MOT datasets that employ panoramic images.
  • ...and 5 more figures
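
The captions for Figures 1 and 4 mention three track operations (Init, Upd, Del) and two confidence thresholds (initialization and update). The sketch below illustrates how such a track lifecycle might be organized. It is not the authors' implementation: the class and function names, the threshold values, and the miss-count deletion rule are all assumptions made for illustration, and data association is assumed to have happened upstream.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout

# All three values are hypothetical placeholders, not taken from the paper.
INIT_THRESH = 0.6    # minimum confidence to initialize a new track
UPDATE_THRESH = 0.3  # minimum confidence for a match to update a track
MAX_MISSES = 2       # unmatched frames tolerated before a track is deleted


@dataclass
class Track:
    track_id: int
    box: Box
    misses: int = 0


class TrackletManager:
    """Toy track lifecycle: update matched tracks, delete stale ones, init new ones."""

    def __init__(self) -> None:
        self.tracks: Dict[int, Track] = {}
        self._next_id = 0

    def step(
        self,
        matched: Dict[int, Tuple[Box, float]],
        unmatched: List[Tuple[Box, float]],
    ) -> List[int]:
        """Apply one frame of results and return the ids of the live tracks.

        matched:   track_id -> (box, score) for detections already associated.
        unmatched: (box, score) detections that were not associated to any track.
        """
        # Upd / Del: iterate over a copy so entries can be deleted safely.
        for tid, track in list(self.tracks.items()):
            hit = matched.get(tid)
            if hit is not None and hit[1] >= UPDATE_THRESH:
                track.box, track.misses = hit[0], 0
            else:
                track.misses += 1
                if track.misses > MAX_MISSES:
                    del self.tracks[tid]

        # Init: only confident unmatched detections start a new track.
        for box, score in unmatched:
            if score >= INIT_THRESH:
                self.tracks[self._next_id] = Track(self._next_id, box)
                self._next_id += 1

        return sorted(self.tracks)
```

With these placeholder values, a detection scoring 0.9 starts a track while one scoring 0.4 does not, and a track that goes unmatched for three consecutive frames is removed. Using a lower update threshold than initialization threshold is a common design choice that lets weak detections sustain an existing track without creating spurious new ones. Whether OmniTrack uses the same relationship is not stated in this section.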