BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision

Xin Zhao, Shiyu Hu, Yipei Wang, Jing Zhang, Yimin Hu, Rongshuai Liu, Haibin Ling, Yin Li, Renshu Li, Kun Liu, Jiadong Li

TL;DR

BioDrone tackles robust visual tracking under UAV flight by introducing the first bionic drone-based SOT benchmark collected with a flapping-wing UAV, emphasizing tiny targets and drastic frame changes. It provides 600 videos with 304,209 manually labeled frames and ten frame-level attributes, plus an evaluation framework and baselines. The authors adapt KeepTrack into UAV-KT and show a measurable precision boost (about 5%) over strong baselines, demonstrating improved robustness for tiny-target and fast-motion scenarios. This benchmark and the proposed baselines offer a rigorous platform for advancing robust vision in UAV contexts and motivate future work in high-speed, aerial, and egocentric tracking tasks.

Abstract

Single object tracking (SOT) is a fundamental problem in computer vision, with a wide range of applications, including autonomous driving, augmented reality, and robot navigation. The robustness of SOT faces two main challenges: tiny targets and fast motion. These challenges are especially pronounced in videos captured by unmanned aerial vehicles (UAVs), where the target is usually far from the camera and often moves significantly relative to it. To evaluate the robustness of SOT methods, we propose BioDrone, the first bionic drone-based visual benchmark for SOT. Unlike existing UAV datasets, BioDrone features videos captured from a flapping-wing UAV system, whose aerodynamics cause major camera shake. BioDrone hence highlights the tracking of tiny targets with drastic changes between consecutive frames, providing a new robust vision benchmark for SOT. To date, BioDrone is the largest UAV-based SOT benchmark, with high-quality fine-grained manual annotations and automatically generated frame-level labels designed for robust vision analyses. Leveraging BioDrone, we conduct a systematic evaluation of existing SOT methods, comparing the performance of 20 representative models and studying novel means of optimizing a SOTA method (KeepTrack) for robust SOT. Our evaluation leads to new baselines and insights for robust SOT. Moving forward, we hope that BioDrone will not only serve as a high-quality benchmark for robust SOT, but also invite future research into robust computer vision. The database, toolkits, evaluation server, and baseline results are available at http://biodrone.aitestunion.com.

Paper Structure (28 sections, 4 equations, 16 figures, 2 tables, 1 algorithm)

Figures (16)

  • Figure 1: This paper studies the robust vision problem in visual object tracking, and we propose a bionic drone-based SOT benchmark named BioDrone to support this goal. In this figure, we compare BioDrone (G to J) with generic SOT benchmarks, represented by the VOT short-term tracking competitions VOT2018 and VOT2019 (A to B), LaSOT (C to D), and VideoCube (E to F). We select the same object categories (car and person) from the different benchmarks and overlay the results of state-of-the-art (SOTA) tracking methods for comparison ($\blacksquare$ green bounding-box represents ground-truth, $\blacksquare$ yellow bounding-box represents KeepTrack, $\blacksquare$ blue bounding-box represents MixFormer, $\blacksquare$ red bounding-box represents SiamRCNN). Compared to other benchmarks, BioDrone highlights the challenges of tiny targets and fast motion. These factors affect both appearance and motion information, causing difficulties for most tracking algorithms on BioDrone. Most SOTA methods lose the target after tens of frames on BioDrone, yet perform well for thousands of frames on other benchmarks.
  • Figure 2: Summary of existing SOT benchmarks, including classical benchmarks (OTB100, VOT2016, VOT2018, VOT2019, GOT-10k, VOTLT2019, LaSOT, VideoCube) and UAV-based benchmarks (UAV123, UAVDT, DTB70, VisDrone). The bubble diameter is proportional to the total number of frames in a benchmark. Bubbles with dashed borders represent UAV-based benchmarks. The horizontal coordinate represents the average relative scale of the target, and the vertical coordinate represents the average correlation coefficient between consecutive frames. The proposed BioDrone has a smaller target size and more drastic changes between consecutive frames, placing higher demands on the robustness of tracking algorithms.
  • Figure 3: Examples of typical UAVs. Compared to the other two types of UAVs, flapping-wing UAVs pose more challenges due to their bionic mechanical structure.
  • Figure 4: Summary of existing UAV-based datasets and generic SOT datasets (1k=$10^3$, 1m=$10^6$). To our knowledge, BioDrone is the first SOT benchmark collected by the bionic-based vision system and the largest UAV-based SOT benchmark.
  • Figure 5: Illustrations of the flapping-wing UAV used for data collection and the representative data of BioDrone. Different flight attitudes for various scenes under three lighting conditions are included in the data acquisition process, ensuring that BioDrone can fully reflect the robust visual challenges of the flapping-wing UAVs.
  • ...and 11 more figures
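
The two axes described in the Figure 2 caption (average relative target scale and average correlation coefficient between consecutive frames) can be illustrated with a minimal sketch. The exact definitions used by the authors are not given here, so this sketch assumes that relative scale is the square root of the bounding-box area over the image area and that frame similarity is the Pearson correlation of grayscale pixel intensities. The function names `relative_scale`, `frame_correlation`, and `mean_consecutive_correlation` are hypothetical and not part of any BioDrone toolkit.

```python
from math import sqrt


def relative_scale(box_w, box_h, img_w, img_h):
    """Assumed definition: sqrt(box area / image area), a value in (0, 1]."""
    return sqrt((box_w * box_h) / (img_w * img_h))


def frame_correlation(frame_a, frame_b):
    """Pearson correlation of two grayscale frames given as flat pixel lists."""
    if not frame_a or len(frame_a) != len(frame_b):
        raise ValueError("frames must be non-empty and the same size")
    n = len(frame_a)
    mean_a = sum(frame_a) / n
    mean_b = sum(frame_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(frame_a, frame_b))
    var_a = sum((a - mean_a) ** 2 for a in frame_a)
    var_b = sum((b - mean_b) ** 2 for b in frame_b)
    if var_a == 0 or var_b == 0:
        # Correlation is undefined for a constant frame; returning 0.0 is
        # a convention chosen for this sketch, not taken from the paper.
        return 0.0
    return cov / sqrt(var_a * var_b)


def mean_consecutive_correlation(frames):
    """Average the correlation over every pair of neighbouring frames."""
    pairs = list(zip(frames, frames[1:]))
    if not pairs:
        raise ValueError("need at least two frames")
    return sum(frame_correlation(a, b) for a, b in pairs) / len(pairs)
```

Under these assumptions, a sequence with heavy camera shake would yield a lower mean correlation, and a distant target would yield a smaller relative scale, which is the region of the plot the caption associates with BioDrone.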