SF-Loc: A Visual Mapping and Geo-Localization System based on Sparse Visual Structure Frames
Yuxuan Zhou, Xingxing Li, Shengyu Li, Chunxi Xia, Xuanbin Wang, Shaoquan Feng
TL;DR
SF-Loc tackles reliable, large-scale geo-localization with lightweight maps by introducing sparse visual structure frames, which store a sparse set of frames together with dense but compact depth. The method combines multi-sensor dense bundle adjustment (MS-DBA) for accurate mapping with a coarse-to-fine localization pipeline that uses spatially smoothed similarity (SSS) and spatiotemporally associated similarity (SAS) to fuse multi-frame cues. Experiments on cross-season urban data show decimeter-level re-localization with a map size of about 3 MB/km, validating the approach under GNSS outages and appearance changes. The work reports real-time viability (≈110 ms per frame) and announces an open-source release, with potential impact for robotics and autonomous systems that need robust, scalable map-aided localization.
Abstract
For high-level geo-spatial applications and intelligent robotics, accurate global pose information is of crucial importance. Map-aided localization is a universal approach to overcome the limitations of the global navigation satellite system (GNSS) in challenging environments. However, current solutions face challenges in terms of mapping flexibility, storage burden and re-localization performance. In this work, we present SF-Loc, a lightweight visual mapping and map-aided localization system whose core idea is a map representation based on sparse frames with dense but compact depth, termed visual structure frames. In the mapping phase, multi-sensor dense bundle adjustment (MS-DBA) is applied to construct geo-referenced visual structure frames. Local co-visibility is checked to keep the map sparse and to enable incremental mapping. In the localization phase, coarse-to-fine vision-based localization is performed, in which multi-frame information and the map distribution are fully integrated. Specifically, the concept of spatially smoothed similarity (SSS) is proposed to overcome place ambiguity, and pairwise frame matching is applied for efficient and robust pose estimation. Experimental results on a cross-season dataset verify the effectiveness of the system. In complex urban road scenarios, the map size is down to 3 MB per kilometer and stable decimeter-level re-localization can be achieved. The code will be made open-source soon (https://github.com/GREAT-WHU/SF-Loc).
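To make the idea of spatially smoothed similarity more concrete, the following is a minimal sketch and not the authors' formulation, which the abstract does not specify. It assumes that each map frame carries a global descriptor and a 2D position, that raw similarity is the cosine similarity between descriptors, and that smoothing is a Gaussian-weighted average over spatially neighboring map frames. The function names, the Gaussian kernel and the `sigma` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def cosine_similarity(query, map_desc):
    """Cosine similarity between one query descriptor (D,) and N map descriptors (N, D)."""
    q = query / np.linalg.norm(query)
    m = map_desc / np.linalg.norm(map_desc, axis=1, keepdims=True)
    return m @ q


def spatially_smoothed_similarity(sim, positions, sigma=5.0):
    """Average each map frame's similarity with that of its spatial neighbors.

    sim:       (N,) raw similarity of the query to each map frame
    positions: (N, 2) map frame positions in meters (assumed layout)
    sigma:     Gaussian kernel width in meters (hypothetical parameter)
    """
    # Pairwise squared distances between map frames.
    diff = positions[:, None, :] - positions[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # Gaussian weights: nearby frames contribute more to the smoothed score.
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    # Row-normalized weighted average of the raw similarities.
    return (w @ sim) / w.sum(axis=1)


if __name__ == "__main__":
    # Nine map frames spaced 5 m apart along a straight road.
    positions = np.array([[5.0 * i, 0.0] for i in range(9)])
    # Frame 1 is an isolated false match; frames 4 to 6 are consistently similar.
    sim = np.array([0.1, 0.9, 0.1, 0.1, 0.7, 0.8, 0.7, 0.1, 0.1])
    sss = spatially_smoothed_similarity(sim, positions, sigma=5.0)
    print("raw best frame:     ", int(np.argmax(sim)))
    print("smoothed best frame:", int(np.argmax(sss)))
```

In this toy case the raw similarity peaks at an isolated frame whose neighbors do not support it, while the smoothed score favors the region where several adjacent frames agree, which is the kind of place ambiguity the abstract says SSS is meant to address.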
