Watching Swarm Dynamics from Above: A Framework for Advanced Object Tracking in Drone Videos

Duc Pham, Matthew Hansen, Félicie Dhellemmes, Jens Krause, Pia Bideau

TL;DR

The paper tackles long-term tracking of collective animal behavior from moving drone footage in open environments where landmarks are scarce. It introduces SwDA, which fuses frame-level semantic segmentation with a particle-filter Bayesian tracker to recursively integrate observations $o_t$ and drone motion, yielding 2D swarm footprints and 3D world trajectories via $p = K [R|t] P$. Key contributions include a novel framework for world-coordinate swarm tracking in marine settings, a 40-minute drone video dataset with synchronized sensors and pixel-accurate masks, and comprehensive evaluations showing robustness in low-data regimes and accurate 3D localization. The work enables non-invasive, scalable study of open-ocean collective behavior and demonstrates how learning-based perception can be effectively integrated with classical state estimation for ecological research.
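The projection $p = K [R|t] P$ mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the zero-skew intrinsics, the example numbers, and the assumption that the swarm lies on a flat sea surface at world height $z = 0$ are all illustrative choices introduced here.

```python
def project(K, R, t, P):
    """Project a 3D world point P to pixel coordinates via p ~ K [R|t] P."""
    # Camera-frame coordinates: X_c = R P + t
    Xc = [sum(R[i][j] * P[j] for j in range(3)) + t[i] for i in range(3)]
    # Homogeneous pixel coordinates: p = K X_c
    p = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    if p[2] <= 0:
        raise ValueError("point is not in front of the camera")
    return p[0] / p[2], p[1] / p[2]


def backproject_to_sea_surface(K, R, t, u, v):
    """Intersect the viewing ray of pixel (u, v) with the plane z = 0.

    Assumption (not from the paper): zero-skew intrinsics and a flat
    sea surface at world height z = 0.
    """
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    ray_cam = [(u - cx) / fx, (v - cy) / fy, 1.0]
    # Ray direction in world coordinates: d = R^T ray_cam
    d = [sum(R[j][i] * ray_cam[j] for j in range(3)) for i in range(3)]
    # Camera centre in world coordinates: C = -R^T t
    C = [-sum(R[j][i] * t[j] for j in range(3)) for i in range(3)]
    if d[2] == 0:
        raise ValueError("viewing ray is parallel to the sea surface")
    s = -C[2] / d[2]
    if s <= 0:
        raise ValueError("sea surface is behind the camera")
    return [C[i] + s * d[i] for i in range(3)]


# Hypothetical setup: camera 50 m from the surface, looking along the world z-axis.
K = [[1000.0, 0.0, 960.0], [0.0, 1000.0, 540.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 50.0]

pixel = project(K, R, t, [5.0, -2.0, 0.0])
print(pixel)  # (1060.0, 500.0)
print(backproject_to_sea_surface(K, R, t, *pixel))
```

In a drone setting, $R$ and $t$ would come from the onboard IMU and GPS readings, and the back-projection step is what turns a 2D detection into a world-coordinate position.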

Abstract

Easily accessible sensors, such as drones with diverse onboard sensors, have greatly expanded the study of animal behavior in natural environments. Yet analyzing vast amounts of unlabeled video data, often spanning hours, remains a challenge for machine learning, especially in computer vision. Existing approaches often analyze only a few frames, whereas our focus is on long-term animal behavior analysis. To address this challenge, we use classical probabilistic methods for state estimation, such as particle filtering. By incorporating recent advances in semantic object segmentation, we enable continuous tracking of rapidly evolving object formations, even in scenarios with limited data availability. Particle filters offer a provably optimal algorithmic structure for recursively adding new incoming information. We propose a novel approach for tracking schools of fish in the open ocean from drone videos. Our framework goes beyond classical 2D object tracking: it tracks the position and spatial expansion of the fish school in world coordinates by fusing video data with the drone's onboard sensor information (GPS and IMU). For the first time, the presented framework allows researchers to study the collective behavior of fish schools in their natural social and environmental context in a non-invasive and scalable way.
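The recursive structure described in the abstract can be sketched as a generic bootstrap particle filter: predict, weight, resample. This is a simplified illustration, not the paper's tracker. The function name `particle_filter_step`, the synthetic Gaussian "segmentation probability", the noise level, and the per-frame image shift standing in for drone-induced motion are all assumptions made for this example.

```python
import math
import random


def particle_filter_step(particles, likelihood, induced_motion, noise_std, rng):
    """One predict / update / resample cycle over 2D image positions."""
    dx, dy = induced_motion
    # Predict: shift every particle by the drone-induced image motion plus noise.
    moved = [
        (x + dx + rng.gauss(0.0, noise_std), y + dy + rng.gauss(0.0, noise_std))
        for x, y in particles
    ]
    # Update: weight each particle by the observation likelihood at its position.
    weights = [likelihood(x, y) for x, y in moved]
    if sum(weights) <= 0.0:
        return moved  # no usable evidence in this frame: keep the prediction
    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


rng = random.Random(0)
particles = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(500)]

center = [50.0, 40.0]   # synthetic swarm position in the image
shift = (1.0, 0.5)      # synthetic per-frame shift caused by drone motion
for _ in range(10):
    center[0] += shift[0]
    center[1] += shift[1]

    def likelihood(x, y, c=tuple(center)):
        # Stand-in for a segmentation probability map: a Gaussian blob at c.
        return math.exp(-((x - c[0]) ** 2 + (y - c[1]) ** 2) / (2 * 5.0 ** 2))

    particles = particle_filter_step(particles, likelihood, shift, 2.0, rng)

est_x = sum(x for x, _ in particles) / len(particles)
est_y = sum(y for _, y in particles) / len(particles)
print(round(est_x, 1), round(est_y, 1))  # close to the final center (60, 45)
```

Because each step consumes only the previous particle set and the current observation, the per-frame cost stays constant regardless of video length, which is what makes this structure suitable for long recordings.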

Paper Structure (9 sections, 2 equations, 4 figures, 4 tables)

This paper contains 9 sections, 2 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Swarm Dynamics from Above (SwDA), a framework for tracking collective behavior from drone videos. The recursive architecture of a particle filter, coupled with frame-by-frame semantic segmentation, allows tracking over long time horizons.
  • Figure 2: Illustration of the motion model. Each particle is displaced by the motion vector induced by the drone's movement.
  • Figure 3: Tracking accuracy for different amounts of labeled training data. Accuracy is measured via the successful detection rate (SDR). Results for two precision ranges are shown: SDR within a radius of 30 pixels (blue) and SDR within a radius of 20 pixels (purple).
  • Figure 4: Qualitative results for three different videos: (a) original frame, (b) swarm detection via particle tracking, and (c) global movement trajectory of the swarm, with 10 m per grid cell and color visualising time. The overall video length is $\sim$5 min. Full videos are provided in the accompanying supplementary material.
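
The successful detection rate (SDR) reported in Figure 3 can be illustrated with a short sketch. The caption does not spell out the exact definition, so the version below is an assumption: it counts a frame as a successful detection when the predicted swarm position lies within a given pixel radius of the annotated position. The function name and the sample coordinates are hypothetical.

```python
import math


def successful_detection_rate(predicted, ground_truth, radius_px):
    """Fraction of frames whose predicted position is within radius_px of the annotation."""
    if len(predicted) != len(ground_truth):
        raise ValueError("need one prediction per annotated frame")
    if not predicted:
        raise ValueError("need at least one frame")
    hits = sum(
        1
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
        if math.hypot(px - gx, py - gy) <= radius_px
    )
    return hits / len(predicted)


# Made-up positions; per-frame errors are 0, 25, 40 and 35 pixels.
predicted = [(100, 100), (125, 100), (100, 140), (10, 10)]
ground_truth = [(100, 100), (100, 100), (100, 100), (10, 45)]

print(successful_detection_rate(predicted, ground_truth, 30))  # 0.5
print(successful_detection_rate(predicted, ground_truth, 20))  # 0.25
```

A tighter radius can only keep or lower the rate, which matches the two precision ranges shown in the figure.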