BEVRender: Vision-based Cross-view Vehicle Registration in Off-road GNSS-denied Environment
Lihong Jin, Wei Dong, Wenshan Wang, Michael Kaess
TL;DR
BEVRender targets ground-vehicle localization in off-road environments where GNSS is unavailable and the lack of distinct visual landmarks hinders vision-based approaches. It introduces a learning-based pipeline that synthesizes local BEV images from multi-view camera data using a BEVFormer-inspired feature encoder with deformable attention, followed by a CNN-based BEV rendering head. These local BEV images are registered to a geo-referenced aerial map via template matching based on normalized cross-correlation (NCC), which yields accurate 2D localization while avoiding the storage burden typical of image-retrieval methods. Real-world experiments on Pittsburgh data show improved localization accuracy and update frequency, and ablations and cross-sequence tests demonstrate robustness and generalization to unseen trajectories. The approach offers a practical, scalable solution for online localization in off-road environments, potentially enabling more reliable autonomous operation where GNSS is unavailable or unreliable.
Abstract
We introduce BEVRender, a novel learning-based approach for the localization of ground vehicles in Global Navigation Satellite System (GNSS)-denied off-road scenarios. These environments are typically challenging for conventional vision-based state estimation due to the lack of distinct visual landmarks and the instability of vehicle poses. To address this, BEVRender generates high-quality bird's-eye-view (BEV) images of the local terrain. These images are then aligned with a geo-referenced aerial map through template matching to achieve accurate cross-view registration. Our approach overcomes two limitations of existing methods: the drift inherent to visual-inertial odometry systems and the substantial storage requirements that limit the scalability of image-retrieval localization strategies. Extensive experiments validate BEVRender's improvement over existing GNSS-denied visual localization methods, showing notable gains in both localization accuracy and update frequency.
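
The template-matching registration step can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes single-channel images, a brute-force search, and zero-mean normalized cross-correlation as the similarity score. The function name `ncc_match` and the synthetic `aerial_map` and `local_bev` arrays are hypothetical, introduced only for illustration. Converting the matched pixel offset into geo-referenced coordinates would depend on the map's resolution and origin, which are not shown here.

```python
import numpy as np


def ncc_match(aerial, template):
    """Brute-force zero-mean NCC search.

    Returns (row, col, score) for the top-left corner of the best match
    of `template` inside `aerial`. Both inputs are 2D float arrays.
    """
    H, W = aerial.shape
    h, w = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    best = (-1, -1, -np.inf)
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            patch = aerial[r:r + h, c:c + w]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * t_norm
            if denom == 0:
                # Flat patch or flat template: correlation is undefined.
                continue
            score = float((p * t).sum() / denom)
            if score > best[2]:
                best = (r, c, score)
    return best


# Synthetic stand-ins (assumptions): a random "aerial map" and a "local BEV"
# cropped from it, with a gain and offset change to mimic appearance shift.
rng = np.random.default_rng(0)
aerial_map = rng.random((40, 40))
true_row, true_col = 12, 25
local_bev = 0.5 * aerial_map[true_row:true_row + 8, true_col:true_col + 8] + 0.2

row, col, score = ncc_match(aerial_map, local_bev)
print(row, col, round(score, 6))
```

Because the score subtracts the mean and divides by the norm of each patch, it is unaffected by a uniform brightness gain or offset between the rendered BEV image and the aerial map. A practical system would likely replace the double loop with an FFT-based or library implementation for speed.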
