Benchmarking EfficientTAM on FMO datasets

Senem Aktas, Charles Markham, John McDonald, Rozenn Dahyot

TL;DR

The paper tackles the challenge of tracking fast-moving, small objects by creating FMOX, a JSON-based extension that unifies four public FMO datasets and adds size-aware annotations. It then benchmarks EfficientTAM on FMOX without additional training, using TIoU as the evaluation metric and a first-frame bounding-box initialization. Results show competitive TIoU on Falling Objects and TbD-3D, while revealing limitations with motion blur and multi-instance scenarios, and underscore the importance of initialization strategy. The work provides open-source data and tooling to standardize FMO research and motivates future work on size-aware metrics and motion-related effects.

Abstract

Tracking fast and tiny objects remains a challenge in computer vision. In this paper, we first introduce a JSON metadata file associated with four open-source datasets of Fast Moving Object (FMO) image sequences. We then extend the description of these FMO datasets with additional ground-truth information in JSON format (called FMOX), including object size. Finally, we use FMOX to test a recently proposed foundation model for tracking, EfficientTAM, and show that its performance compares well with the pipelines originally tailored to these FMO datasets. Our comparison of these state-of-the-art techniques on FMOX is reported using Trajectory Intersection over Union (TIoU) scores. The code and JSON files are shared as open source, making FMOX accessible and usable by other machine learning pipelines that process FMO datasets.
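
The abstract names TIoU as the evaluation metric without spelling out its formula. In the FMO literature, TIoU is usually described as the time-average of the IoU between the object shape placed at the estimated trajectory position and the same shape placed at the ground-truth position. The sketch below illustrates that idea under two simplifying assumptions that are not taken from the paper: the object is approximated by a disk of fixed radius, and the trajectory is sampled at a finite set of time steps. The names `disk_iou` and `tiou` are illustrative only and do not come from the FMOX code.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def disk_iou(c1: Point, c2: Point, radius: float) -> float:
    """IoU of two disks with the same radius, centred at c1 and c2.

    Assumption: the object is modelled as a disk; real evaluations may
    use the ground-truth object mask instead.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    d = math.dist(c1, c2)
    if d >= 2 * radius:
        return 0.0  # disks do not overlap
    # Area of the lens formed by two overlapping circles of equal radius.
    inter = (2 * radius**2 * math.acos(d / (2 * radius))
             - 0.5 * d * math.sqrt(4 * radius**2 - d**2))
    union = 2 * math.pi * radius**2 - inter
    return inter / union


def tiou(estimated: Sequence[Point], ground_truth: Sequence[Point],
         radius: float) -> float:
    """Mean per-sample IoU along two trajectories sampled at the same times."""
    if len(estimated) != len(ground_truth) or not ground_truth:
        raise ValueError("trajectories must be non-empty and of equal length")
    scores = [disk_iou(e, g, radius) for e, g in zip(estimated, ground_truth)]
    return sum(scores) / len(scores)
```

For example, `tiou([(0, 0), (10, 0)], [(0, 0), (30, 0)], radius=5.0)` averages a perfect match at the first sample with a complete miss at the second. A score near 1 means the estimated trajectory stays on the object along the whole sequence, while a small object makes the score drop quickly for the same pixel error.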

Paper Structure

This paper contains 12 sections, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Falling Objects [kotera2020restoration] samples.
  • Figure 2: FMOv2 [rozumnyi2017world] samples.
  • Figure 3: TbD [kotera2019intra] samples.
  • Figure 4: TbD-3D [rozumnyi2020sub] samples.
  • Figure 5: EfficientTAM estimated trajectories on the TbD-3D dataset. Green indicates the ground-truth trajectory and red the trajectory estimated by EfficientTAM. TIoU values are above 0.81 for all sequences. The objects (mostly balls) are fairly large and, despite motion blur, remain clearly visible throughout the sequences.