UAVD4L: A Large-Scale Dataset for UAV 6-DoF Localization

Rouwan Wu, Xiaoya Cheng, Juelin Zhu, Xuxiang Liu, Maojun Zhang, Shen Yan

TL;DR

UAVD4L provides a large-scale, GPS-denied UAV localization benchmark with a world-aligned textured 3D model and accurate 6-DoF ground truth, enabling offline synthetic data generation and online visual localization. The authors introduce a two-stage UAVLoc pipeline that uses synthetic renders and rotation priors to constrain retrieval and a gravity-guided PnP RANSAC for robust pose estimation, followed by a hierarchical system for ground-target tracking using a wide-angle and a zoom camera with DEM-based projection. Empirical results show strong performance in image retrieval, 6-DoF localization, and target tracking, with ablations confirming benefits from multi-layer rendering and sensor priors. The dataset and code are released to advance research in GPS-denied UAV perception, navigation, and 3D target tracking. Overall, UAVD4L bridges gaps in scale, viewpoint diversity, GT accuracy, and sensor integration for airborne visual localization.
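
The idea of constraining retrieval with a rotation prior can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the yaw-only prior, the 30-degree threshold, and the toy two-dimensional descriptors are all assumptions made for illustration. Rendered reference images whose heading disagrees with the onboard sensor reading are discarded before the top-k ranking by descriptor similarity.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical database entry: (image name, global descriptor, yaw in degrees).
Entry = Tuple[str, Sequence[float], float]


def angle_diff_deg(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero descriptors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v)


def retrieve_with_rotation_prior(
    query_desc: Sequence[float],
    query_yaw_deg: float,
    database: List[Entry],
    k: int = 3,
    max_yaw_diff_deg: float = 30.0,  # assumed tolerance, not from the paper
) -> List[str]:
    """Return the names of the top-k references that agree with the yaw prior."""
    scored = [
        (cosine_similarity(query_desc, desc), name)
        for name, desc, yaw in database
        if angle_diff_deg(query_yaw_deg, yaw) <= max_yaw_diff_deg
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy example: "r1" looks similar but faces the opposite direction.
toy_database: List[Entry] = [
    ("r0", [1.0, 0.0], 10.0),
    ("r1", [0.9, 0.1], 200.0),
    ("r2", [0.5, 0.5], 350.0),
    ("r3", [0.0, 1.0], 20.0),
]
top = retrieve_with_rotation_prior([1.0, 0.0], 0.0, toy_database, k=2)
```

In this toy run the visually similar but wrongly oriented render `r1` is rejected by the prior, so the ranking is computed only over views that are geometrically plausible.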

Abstract

Despite significant progress in global localization of Unmanned Aerial Vehicles (UAVs) in GPS-denied environments, existing methods remain constrained by the availability of datasets. Current datasets often focus on small-scale scenes and lack viewpoint variability, accurate ground truth (GT) poses, and UAV built-in sensor data. To address these limitations, we introduce a large-scale 6-DoF UAV dataset for localization (UAVD4L) and develop a two-stage 6-DoF localization pipeline (UAVLoc), which consists of offline synthetic data generation and online visual localization. Additionally, based on the 6-DoF estimator, we design a hierarchical system for tracking ground targets in 3D space. Experimental results on the new dataset demonstrate the effectiveness of the proposed approach. Code and dataset are available at https://github.com/RingoWRW/UAVD4L

Paper Structure

This paper contains 31 sections, 2 equations, 9 figures, and 4 tables.

Figures (9)

  • Figure 1: Top. We introduce a large-scale dataset for the 6-DoF localization of UAVs. The dataset includes a 3D reference textured model, which enables the generation of synthetic data such as rendered RGB and depth images, as well as a Digital Surface Model (DSM). Bottom. We also develop an offline-and-online pipeline for performing 6-DoF localization of UAVs in GPS-denied environments.
  • Figure 2: Distribution of query images. The query sequence consists of five trajectories, with the first four covering urban areas characterized by a high density of buildings, while the fifth trajectory covers a rural area with predominantly vegetation. The red locators, numbered from 1 to 5, represent the shooting positions.
  • Figure 3: GT pose quality on UAVD4L. Pixel-aligned renderings of the estimated camera pose confirm that the poses are sufficiently accurate for evaluation.
  • Figure 4: Overview of the proposed method. 1. We generate comprehensive synthetic data from the textured 3D model, including RGB images ${I_r}$ and depth maps ${D_r}$ (Section \ref{database}). 2. For each query image $I_q$, we use an image retrieval algorithm combined with the rotation sensor prior $S$ to find the top-$k$ relevant images. 3. Then, we apply a feature detection and matching algorithm to establish 2D-3D correspondences between the query image $I_q$ and the relevant images ${I_r}$. A gravity-guided PnP RANSAC is used to obtain the pose $\zeta_q$ of the UAV (Section \ref{localization}).
  • Figure 5: The hierarchical target tracking system consists of two cameras: a wide-angle camera and a zoom camera. The wide-angle camera is used to recover the 6-DoF pose of the UAV, while the zoom camera is used to accurately detect the target. A ray-tracking technique is employed to track designated targets in 3D space.
  • ...and 4 more figures
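
The target-tracking caption (Figure 5) describes casting a ray from the camera toward a detected target and locating it in 3D. One common way to do this against an elevation model is ray marching followed by bisection, sketched below. This is an illustrative assumption, not the paper's code: the function name `intersect_ray_with_dem`, the `height_at` callback standing in for an elevation lookup, and the step and range values are all hypothetical. The ray direction is assumed to be a unit vector already expressed in the world frame, which in practice would come from the estimated 6-DoF pose and the target's pixel coordinates.

```python
from typing import Callable, Optional, Tuple

Vec3 = Tuple[float, float, float]


def intersect_ray_with_dem(
    origin: Vec3,
    direction: Vec3,  # assumed unit length, world frame
    height_at: Callable[[float, float], float],  # hypothetical elevation lookup
    step: float = 1.0,  # marching step in metres (assumed)
    max_range: float = 2000.0,  # give up beyond this distance (assumed)
    refine_iters: int = 30,
) -> Optional[Vec3]:
    """Return the first point where the ray reaches the terrain, or None."""

    def point(t: float) -> Vec3:
        return (
            origin[0] + t * direction[0],
            origin[1] + t * direction[1],
            origin[2] + t * direction[2],
        )

    def clearance(t: float) -> float:
        # Positive while the ray is above the terrain surface.
        x, y, z = point(t)
        return z - height_at(x, y)

    if clearance(0.0) <= 0.0:
        return None  # camera is not above the terrain

    t_prev, t = 0.0, step
    while t <= max_range:
        if clearance(t) <= 0.0:
            # The crossing lies in (t_prev, t]; refine it by bisection.
            lo, hi = t_prev, t
            for _ in range(refine_iters):
                mid = 0.5 * (lo + hi)
                if clearance(mid) > 0.0:
                    lo = mid
                else:
                    hi = mid
            return point(hi)
        t_prev = t
        t += step
    return None


# Toy example: flat terrain at 10 m, camera 110 m up, looking forward and down.
hit = intersect_ray_with_dem((0.0, 0.0, 110.0), (0.6, 0.0, -0.8), lambda x, y: 10.0)
miss = intersect_ray_with_dem((0.0, 0.0, 110.0), (0.0, 0.0, 1.0), lambda x, y: 10.0)
```

A fixed marching step can skip thin terrain features narrower than the step, so a real system would likely tie the step to the elevation grid resolution.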