SDG-Track: A Heterogeneous Observer-Follower Framework for High-Resolution UAV Tracking on Embedded Platforms

Jiawen Wen, Yu Hu, Suixuan Qiu, Jinshan Huang, Xiaowen Chu

TL;DR

Real-time tracking of small UAVs on edge devices is hindered by a resolution-speed conflict between high-resolution imagery and limited processing power. SDG-Track addresses this with an Observer-Follower architecture: a high-capacity detector runs at low frequency on the GPU to provide absolute position anchors from 1080p frames, while a CPU-based, ROI-constrained sparse optical-flow module interpolates the trajectory at high frequency. A training-free Dual-Space Recovery mechanism fuses Lab and HSV color cues with geometric constraints to re-acquire the target after occlusion or drift. Experiments on a Jetson Orin Nano show 35.1 FPS system throughput while preserving 97.2% of the detector's precision, which is sufficient for real-world gimbal control when tracking agile drones with limited onboard compute.

Abstract

Real-time tracking of small unmanned aerial vehicles (UAVs) on edge devices faces a fundamental resolution-speed conflict. Downsampling high-resolution imagery to standard detector input sizes causes small target features to collapse below detectable thresholds, yet processing native 1080p frames on resource-constrained platforms yields insufficient throughput for smooth gimbal control. We propose SDG-Track, a Sparse Detection-Guided Tracker that adopts an Observer-Follower architecture to reconcile this conflict. The Observer stream runs a high-capacity detector at low frequency on the GPU to provide accurate position anchors from 1920x1080 frames. The Follower stream performs high-frequency trajectory interpolation via ROI-constrained sparse optical flow on the CPU. To handle tracking failures from occlusion or model drift caused by spectrally similar distractors, we introduce Dual-Space Recovery, a training-free re-acquisition mechanism combining color histogram matching with geometric consistency constraints. Experiments on a ground-to-air tracking station demonstrate that SDG-Track achieves 35.1 FPS system throughput while retaining 97.2% of the frame-by-frame detection precision. The system successfully tracks agile FPV drones under real-world operational conditions on an NVIDIA Jetson Orin Nano. Our code is publicly available at https://github.com/Jeffry-wen/SDG-Track
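
The interplay between the two streams described in the abstract can be illustrated with a minimal single-threaded sketch. It is not the authors' implementation: the real system runs the Observer on the GPU and the Follower on the CPU, whereas this loop only shows the scheduling idea. The names `run_observer_follower`, `detect`, `follow`, and the period `detect_every` are hypothetical, and the abstract does not state the detection period.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h) in pixels


@dataclass
class TrackState:
    box: Optional[Box]  # target box for this frame, None if lost
    source: str         # "observer", "follower", or "none"


def run_observer_follower(
    frames: List[object],
    detect: Callable[[object], Optional[Box]],
    follow: Callable[[object, Box], Optional[Box]],
    detect_every: int = 5,  # assumed period, not taken from the paper
) -> List[TrackState]:
    """Interleave a slow, accurate detector with a fast per-frame follower."""
    states: List[TrackState] = []
    box: Optional[Box] = None
    for i, frame in enumerate(frames):
        new_box: Optional[Box] = None
        source = "none"
        # Observer: absolute anchor on scheduled frames, or whenever
        # there is no current box to propagate.
        if i % detect_every == 0 or box is None:
            new_box = detect(frame)
            if new_box is not None:
                source = "observer"
        # Follower: propagate the previous box when no anchor is available.
        if new_box is None and box is not None:
            new_box = follow(frame, box)
            if new_box is not None:
                source = "follower"
        box = new_box
        states.append(TrackState(box, source))
    return states


if __name__ == "__main__":
    # Toy target moving 2 px per frame along x; both callbacks are stand-ins.
    demo = run_observer_follower(
        frames=list(range(10)),
        detect=lambda f: (2.0 * f, 0.0, 12.0, 10.0),
        follow=lambda f, b: (b[0] + 2.0, b[1], b[2], b[3]),
    )
    for state in demo:
        print(state.source, state.box)
```

In this sketch a failed follower step leaves the box empty, so the next frame falls back to the detector. The paper's Dual-Space Recovery module would take the place of that fallback.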

Paper Structure

This paper contains 21 sections, 1 equation, 5 figures, 3 tables.

Figures (5)

  • Figure 1: The inherent trade-off between detection resolution and inference speed on edge devices. (Top) Aggressive downsampling to $640 \times 640$ enables real-time performance ($50$ FPS) but results in the loss of critical spatial details, leading to detection failure. (Bottom) While maintaining the original $1920 \times 1080$ resolution allows the detector to resolve the small UAV target ($12 \times 10$ pixels), the computational cost reduces the frame rate to $\sim 7$ FPS, failing to meet the requirements for smooth mechanical tracking.
  • Figure 2: Overview of the SDG-Track framework. The architecture consists of three logical blocks: (1) An Observer Stream utilizing YOLO11-l (GPU) for periodic trajectory correction; (2) A Follower Stream using Optical Flow (CPU) for high-frequency tracking; and (3) A Dual-Space Recovery Module that leverages color and geometric constraints to handle target loss.
  • Figure 3: Workflow of the Follower Stream. The system enhances Pyramidal LK tracking with Median Flow filtering, Drift Correction, and Lazy Template Update. A Kalman Filter smooths the final trajectory for jitter-free control.
  • Figure 4: Workflow of the Dual-Space Recovery Module. The module fuses HSV and Lab probability maps using adaptive weights ($\alpha, \beta$). A binary mask is generated via Otsu segmentation. Finally, the Geometric Safety Valve filters candidates based on a composite score of five constraints (Color, HSV, Size, Previous Position, and Shape) to ensure robust re-capture.
  • Figure 5: Hardware setup of the G2A tracking station. The Jetson Orin Nano processes 1080p streams from the PTZ camera and generates real-time control commands to track agile FPV drones.
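
The Median Flow filtering step named in the Figure 3 caption can be sketched as follows. This is one plausible reading, not the paper's code: take the per-axis median displacement of the tracked points and drop points that deviate from it by more than a threshold. The function names and the `max_dev` threshold are assumptions. In practice the point pairs would come from a pyramidal Lucas-Kanade tracker such as OpenCV's `cv2.calcOpticalFlowPyrLK`, restricted to the ROI around the last known box.

```python
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x, y, w, h) in pixels


def median_flow_filter(
    prev_pts: List[Point],
    next_pts: List[Point],
    max_dev: float = 2.0,  # assumed outlier threshold in pixels
) -> Tuple[Point, List[int]]:
    """Return the median displacement and the indices of consistent points."""
    if not prev_pts or len(prev_pts) != len(next_pts):
        raise ValueError("need two non-empty point lists of equal length")
    dxs = [n[0] - p[0] for p, n in zip(prev_pts, next_pts)]
    dys = [n[1] - p[1] for p, n in zip(prev_pts, next_pts)]
    mdx, mdy = median(dxs), median(dys)
    # Keep points whose displacement lies within max_dev of the median.
    keep = [
        i
        for i, (dx, dy) in enumerate(zip(dxs, dys))
        if ((dx - mdx) ** 2 + (dy - mdy) ** 2) ** 0.5 <= max_dev
    ]
    return (mdx, mdy), keep


def shift_box(box: Box, flow: Point) -> Box:
    """Translate a box by the estimated displacement, keeping its size."""
    x, y, w, h = box
    return (x + flow[0], y + flow[1], w, h)


if __name__ == "__main__":
    # Four points move consistently; the fifth has latched onto background.
    prev = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5)]
    nxt = [(3, 1), (4, 1), (3, 2), (4, 2), (30, -20)]
    flow, inliers = median_flow_filter(prev, nxt)
    print(flow, inliers, shift_box((100, 50, 12, 10), flow))
```

The median is used rather than the mean so that a few points drifting onto the background do not pull the box away from a target as small as the $12 \times 10$ pixel UAV in Figure 1.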