DroneMOT: Drone-based Multi-Object Tracking Considering Detection Difficulties and Simultaneous Moving of Drones and Objects

Peng Wang, Yongcai Wang, Deying Li

TL;DR

Multi-object tracking (MOT) on moving drones faces distinctive challenges: objects are small, blurred, and often occluded, and the drone and the objects move at the same time. The paper proposes DroneMOT, which combines a Dual-domain Integrated Attention (DIA) module for improved detection and feature embedding with a Motion-Driven Association (MDA) module for robust data association. MDA includes Adaptive Feature Synchronization (AFS) and Dual Motion-based Prediction (DMP), which accounts for drone hovering, translation, and rotation. Data association uses a cost function that fuses IoU-based localization, appearance similarity, and rotation cues, expressed as $C = I_C + w_a A_C + w_r R_C$, and is solved via Hungarian assignment to produce trajectories. Experiments on VisDrone2019-MOT and UAVDT report state-of-the-art IDF1 and ID-switch results, supporting the method's effectiveness for drone-based MOT and its potential for practical deployment in aerial surveillance and autonomous flight.
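To make the fused cost $C = I_C + w_a A_C + w_r R_C$ concrete, here is a minimal Python sketch. Several parts are assumptions rather than the paper's definitions: $I_C$ is taken as 1 minus IoU, $A_C$ as cosine distance between embeddings, $R_C$ is passed in as a precomputed matrix, and the weights `w_a` and `w_r` are placeholder values. The function names (`iou`, `cosine_distance`, `fused_cost`, `assign`) are hypothetical. For brevity the assignment step uses an exhaustive search over permutations, which returns the same optimum as Hungarian assignment on small square matrices but is not the Hungarian algorithm itself.

```python
from itertools import permutations
from math import sqrt


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def fused_cost(tracks, dets, rot_cost, w_a=0.5, w_r=0.5):
    """Build C[i][j] = I_C + w_a * A_C + w_r * R_C for every track/detection pair.

    Assumed definitions (not taken from the paper):
      I_C = 1 - IoU, A_C = cosine distance, R_C = rot_cost[i][j] supplied by the caller.
    """
    return [
        [
            (1.0 - iou(t["box"], d["box"]))
            + w_a * cosine_distance(t["emb"], d["emb"])
            + w_r * rot_cost[i][j]
            for j, d in enumerate(dets)
        ]
        for i, t in enumerate(tracks)
    ]


def assign(cost):
    """Minimum-cost one-to-one assignment for a small square cost matrix.

    Exhaustive search used as a stand-in for Hungarian assignment;
    it is only practical for a handful of tracks.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )
    return list(enumerate(best))  # list of (track_index, detection_index)
```

Each track and detection is a dictionary with a `"box"` tuple and an `"emb"` vector. In a real tracker the exhaustive search would be replaced by a polynomial-time solver, and pairs whose cost exceeds a threshold would typically be rejected after assignment.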

Abstract

Multi-object tracking (MOT) on static platforms, such as surveillance cameras, has achieved significant progress, with various paradigms delivering strong performance. However, the effectiveness of traditional MOT methods drops significantly on dynamic platforms such as drones. This drop is attributed to the distinctive challenges of the MOT-on-drone scenario: (1) objects are generally small in the image plane, blurred, and frequently occluded, making them hard to detect and recognize; (2) drones move and see objects from different angles, making the predicted positions and feature embeddings of the objects unreliable. This paper proposes DroneMOT, which first introduces a Dual-domain Integrated Attention (DIA) module that accounts for the fast movements of drones to enhance drone-based object detection and feature embedding for small, blurred, and occluded objects. Then, a Motion-Driven Association (MDA) scheme is introduced that considers the concurrent movements of both the drone and the objects. Within MDA, an Adaptive Feature Synchronization (AFS) technique updates the features of objects seen from different angles, and a Dual Motion-based Prediction (DMP) method forecasts the object positions. Finally, the refined feature embeddings and the predicted positions are combined to improve object association. Comprehensive evaluations on the VisDrone2019-MOT and UAVDT datasets show that DroneMOT provides substantial performance improvements over the state of the art in MOT on drones.

Paper Structure

This paper contains 11 sections, 6 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Challenges of MOT on drones. (a) Comparison between conventional MOT datasets (MOT17/20) and drone-based MOT datasets (VisDrone2019-MOT and UAVDT). The x-axis represents the average change in the pixel position of the same object in adjacent frames; the y-axis represents the coefficient of variation (variance/mean) of the object's bbox size. (b) Visualization of these challenges, encompassing small-scale objects, large pixel offsets, and varying view angles.
  • Figure 2: The overall architecture of DroneMOT. It primarily consists of two modules: the network module for online detection and feature embedding, and the data-association module (MDA) to associate detections with stored trajectories of objects.
  • Figure 3: Structure of Heatmap-Guided Temporal Attention.
  • Figure 4: Feature map comparison without and with DIA.
  • Figure 5: Visualization of tracking results on the VisDrone2019-MOT dataset when the drone is rotating rapidly.
  • ...and 1 more figure
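
The two axes of Figure 1(a) are simple per-object statistics, and the sketch below shows one way they could be computed for a single trajectory. It rests on assumptions not stated in the caption: "pixel position" is taken to be the box center, "bbox size" is taken to be box area, and the variance is the population variance. The function names are hypothetical. The dispersion measure follows the caption's wording (variance/mean); the conventional coefficient of variation uses the standard deviation instead.

```python
from math import hypot


def mean_center_shift(boxes):
    """Average displacement (in pixels) of the box center between adjacent frames.

    boxes: one (x1, y1, x2, y2) tuple per frame for the same object.
    """
    if len(boxes) < 2:
        raise ValueError("need at least two frames")
    centers = [((x1 + x2) / 2.0, (y1 + y2) / 2.0) for x1, y1, x2, y2 in boxes]
    shifts = [
        hypot(bx - ax, by - ay)
        for (ax, ay), (bx, by) in zip(centers, centers[1:])
    ]
    return sum(shifts) / len(shifts)


def size_dispersion(boxes):
    """Variance of box area divided by mean box area, as worded in the caption."""
    if not boxes:
        raise ValueError("need at least one box")
    areas = [(x2 - x1) * (y2 - y1) for x1, y1, x2, y2 in boxes]
    mean = sum(areas) / len(areas)
    if mean == 0:
        raise ValueError("mean box area is zero")
    variance = sum((a - mean) ** 2 for a in areas) / len(areas)
    return variance / mean
```

Averaging these two values over all objects in a dataset would give one point per dataset on such a plot.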