YoloTag: Vision-based Robust UAV Navigation with Fiducial Markers
Sourav Raxit, Simant Bahadur Singh, Abdullah Al Redwan Newaz
TL;DR
This paper tackles GPS-denied UAV localization by detecting fiducial markers with a lightweight YOLOv8-based detector and fusing the detections through an efficient EPnP-based 4D pose estimator. Because the raw pose estimates are noisy, it introduces a Butterworth low-pass filter that smooths the trajectory while trading smoothness against added delay, and validates the system in indoor real-robot experiments. Key contributions include (i) fast multi-marker detection with YOLOv8, (ii) robust pose estimation from multiple landmarks, (iii) noise reduction via a higher-order Butterworth filter, and (iv) indoor benchmarks showing real-time performance (55 FPS) and improved trajectory accuracy over AprilTag and DeepTag. The work demonstrates a practical, real-time fiducial-marker localization system for GPS-denied UAV navigation and points to object-based localization as future work to broaden applicability.
Abstract
By harnessing fiducial markers as visual landmarks in the environment, Unmanned Aerial Vehicles (UAVs) can rapidly build precise maps and navigate spaces safely and efficiently, unlocking their potential for fluent collaboration and coexistence with humans. Existing fiducial marker methods rely on handcrafted feature extraction, which sacrifices accuracy. Deep learning pipelines for marker detection, on the other hand, fail to meet the real-time constraints that navigation applications require. In this work, we propose YoloTag, a real-time fiducial marker-based localization system. YoloTag uses a lightweight YOLOv8 object detector to accurately detect fiducial markers in images while meeting the runtime constraints needed for navigation. The detected markers are then passed to an efficient Perspective-n-Point (PnP) algorithm to estimate the UAV states. However, the resulting state estimates are noisy, which causes instability in trajectory tracking. To suppress this noise, we design a higher-order Butterworth filter based on frequency-domain analysis. We evaluate our algorithm through real-robot experiments in an indoor environment, comparing the trajectory tracking performance of our method against other approaches in terms of several distance metrics.
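
To make the filtering step concrete, the following is a minimal sketch of a higher-order Butterworth low-pass filter applied to one pose coordinate. It is not the authors' implementation: the filter order (4), the cutoff frequency (2 Hz), and the function names `butterworth_sections` and `lowpass` are assumptions chosen for illustration, and the 55 Hz sample rate simply mirrors the reported 55 FPS. The filter is built as a cascade of second-order sections obtained with the standard bilinear-transform low-pass formulas.

```python
import math


def butterworth_sections(order, cutoff_hz, sample_hz):
    """Biquad coefficients (b0, b1, b2, a1, a2) for an even-order
    low-pass Butterworth filter (hypothetical helper, not from the paper)."""
    if order < 2 or order % 2 != 0:
        raise ValueError("this sketch supports even orders >= 2 only")
    if not 0.0 < cutoff_hz < sample_hz / 2.0:
        raise ValueError("cutoff must lie between 0 and the Nyquist frequency")

    w0 = 2.0 * math.pi * cutoff_hz / sample_hz
    cos_w0, sin_w0 = math.cos(w0), math.sin(w0)

    sections = []
    for k in range(order // 2):
        # Each conjugate pole pair of the Butterworth prototype has
        # quality factor Q = 1 / (2 cos(theta_k)).
        theta = math.pi * (2 * k + 1) / (2 * order)
        q = 1.0 / (2.0 * math.cos(theta))
        alpha = sin_w0 / (2.0 * q)
        a0 = 1.0 + alpha
        b0 = (1.0 - cos_w0) / 2.0 / a0
        b1 = (1.0 - cos_w0) / a0
        b2 = b0
        a1 = -2.0 * cos_w0 / a0
        a2 = (1.0 - alpha) / a0
        sections.append((b0, b1, b2, a1, a2))
    return sections


def lowpass(samples, sections):
    """Run the samples through each biquad in turn (causal, so it adds delay)."""
    out = list(samples)
    for b0, b1, b2, a1, a2 in sections:
        z1 = z2 = 0.0  # transposed direct-form II state
        for i, x in enumerate(out):
            y = b0 * x + z1
            z1 = b1 * x - a1 * y + z2
            z2 = b2 * x - a2 * y
            out[i] = y
    return out


if __name__ == "__main__":
    # Assumed values: 4th order, 2 Hz cutoff, 55 Hz pose rate.
    sections = butterworth_sections(order=4, cutoff_hz=2.0, sample_hz=55.0)
    # Synthetic altitude estimate: 1 m hover plus frame-to-frame jitter.
    noisy_z = [1.0 + 0.05 * (-1.0) ** i for i in range(200)]
    smooth_z = lowpass(noisy_z, sections)
    print(smooth_z[-1])
```

Raising the order sharpens the roll-off above the cutoff but lengthens the filter's delay, which is the trade-off a trajectory-tracking controller has to tolerate. An offline evaluation could instead use a zero-phase (forward-backward) pass, which removes the delay but cannot run in real time.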
