Evaluating Global Geo-alignment for Precision Learned Autonomous Vehicle Localization using Aerial Data
Yi Yang, Xuran Zhao, H. Charles Zhao, Shumin Yuan, Samuel M. Bateman, Tiffany A. Huang, Chris Beall, Will Maddern
TL;DR
This work tackles sub-meter global localization for autonomous vehicles by leveraging aerial data and two training-time data-alignment strategies. It shows that aligning aerial maps to vehicle data (or vice versa) before learning significantly improves localization accuracy, and it introduces a two-branch cross-modal localization model trained with alignment-derived ground truth. On a large-scale 1600 km SF Bay Area dataset, vehicle-to-map alignment with DSM and RGB imagery yields sub-meter position error and sub-degree yaw error, while even RGB-only data can achieve competitive medians when trained with proper alignment. The findings underscore the practical potential of low-cost aerial data for precise, scalable autonomous-vehicle localization and guide future improvements in cross-modal geo-localization pipelines.
Abstract
Recently there has been growing interest in the use of aerial and satellite map data for autonomous vehicles, primarily due to its potential for significant cost reduction and enhanced scalability. Despite these advantages, aerial data also comes with challenges such as a sensor-modality gap and a viewpoint gap. Learned localization methods have shown promise for overcoming these challenges to provide precise metric localization for autonomous vehicles. Most learned localization methods rely on coarsely aligned ground truth or on implicit consistency-based methods to learn the localization task. However, in this paper we find that improving the alignment between aerial data and autonomous vehicle sensor data at training time is critical to the performance of a learning-based localization system. We compare two data alignment methods using a factor graph framework and, using these methods, evaluate the effects of closely aligned ground truth on learned localization accuracy through ablation studies. Finally, we evaluate a learned localization system using the data alignment methods on a comprehensive (1600 km) autonomous vehicle dataset and demonstrate localization error below 0.3 m and 0.5$^{\circ}$, sufficient for autonomous vehicle applications.
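To make the reported accuracy figures concrete, the following is a minimal sketch of how a planar position error and a yaw error could be computed for a single estimated pose against its ground truth. It is not the authors' evaluation code. The function names, the `(x, y, yaw)` pose layout, and the per-pose threshold check are assumptions for illustration; the abstract does not state how errors are aggregated (for example, median versus mean). Only the 0.3 m and 0.5 degree values come from the abstract.

```python
import math

def pose_errors(est, gt):
    """Return (position error in metres, yaw error in degrees).

    est, gt: hypothetical (x, y, yaw) tuples, with x and y in metres
    and yaw in radians, both expressed in the same map frame.
    """
    # Planar Euclidean distance between estimated and true position.
    pos_err_m = math.hypot(est[0] - gt[0], est[1] - gt[1])
    # Wrap the yaw difference into (-pi, pi] so that, for example,
    # 359.8 degrees vs 0 degrees counts as 0.2 degrees, not 359.8.
    dyaw = est[2] - gt[2]
    yaw_err_rad = abs(math.atan2(math.sin(dyaw), math.cos(dyaw)))
    return pos_err_m, math.degrees(yaw_err_rad)

def within_thresholds(est, gt, max_pos_m=0.3, max_yaw_deg=0.5):
    """Check one pose against the 0.3 m / 0.5 degree figures (assumed per-pose use)."""
    pos_err_m, yaw_err_deg = pose_errors(est, gt)
    return pos_err_m < max_pos_m and yaw_err_deg < max_yaw_deg

if __name__ == "__main__":
    estimate = (10.10, 5.20, math.radians(359.8))
    truth = (10.00, 5.00, math.radians(0.0))
    print(pose_errors(estimate, truth))
    print(within_thresholds(estimate, truth))
```

The yaw difference is wrapped before taking its magnitude because a naive subtraction overstates the error whenever the two headings straddle the 0/360 degree boundary.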
