Image-Based Relocalization and Alignment for Long-Term Monitoring of Dynamic Underwater Environments
Beverley Gorry, Tobias Fischer, Michael Milford, Alejandro Fontan
TL;DR
This work tackles long-term underwater monitoring by introducing a hierarchical Visual Place Recognition pipeline that fuses global image retrieval with local feature refinement, followed by homography-based registration and 2D warping of segmentation masks to enable pixel-level change analysis. It introduces the SQUIDLE+ VPR Benchmark, the first large-scale underwater VPR dataset drawn from publicly available SQUIDLE+ data, to evaluate cross-time localization across diverse trajectories and environmental conditions. The approach demonstrates substantial speedups over brute-force methods while maintaining competitive accuracy and provides qualitative and quantitative insights into change detection via segmentation warping. The dataset and method offer a scalable, centimeter-level registration capability essential for monitoring dynamic marine ecosystems and informing conservation efforts.
Abstract
Effective monitoring of underwater ecosystems is crucial for tracking environmental changes, guiding conservation efforts, and ensuring long-term ecosystem health. However, automating underwater ecosystem management with robotic platforms remains challenging due to the complexities of underwater imagery, which pose significant difficulties for traditional visual localization methods. We propose an integrated pipeline that combines Visual Place Recognition (VPR), feature matching, and image segmentation on video-derived images. This method enables robust identification of revisited areas, estimation of rigid transformations, and downstream analysis of ecosystem changes. Furthermore, we introduce the SQUIDLE+ VPR Benchmark, the first large-scale underwater VPR benchmark designed to leverage an extensive collection of unstructured data from multiple robotic platforms, spanning time intervals from days to years. The dataset encompasses diverse trajectories, arbitrary overlap, and diverse seafloor types captured under varying environmental conditions, including differences in depth, lighting, and turbidity. Our code is available at: https://github.com/bev-gorry/underloc
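
The pipeline summarized above (coarse global retrieval, local feature refinement, then warping of segmentation masks through an estimated transformation) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: every function name here (shortlist, mutual_matches, relocalize, warp_mask) is hypothetical, descriptors are assumed to be plain lists of floats, cosine similarity and mutual nearest-neighbour counting stand in for whatever retrieval and matching models the released code uses, and the homography is assumed to be already estimated and supplied in its target-to-source form.

```python
import math


def cosine(a, b):
    """Cosine similarity between two descriptor vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def shortlist(query_global, db_globals, k):
    """Stage 1 (assumed): rank database images by global-descriptor similarity."""
    scores = [(cosine(query_global, g), i) for i, g in enumerate(db_globals)]
    scores.sort(key=lambda s: (-s[0], s[1]))  # best score first, index breaks ties
    return [i for _, i in scores[:k]]


def nearest(desc, pool):
    """Index of the descriptor in `pool` most similar to `desc`."""
    return max(range(len(pool)), key=lambda j: cosine(desc, pool[j]))


def mutual_matches(local_a, local_b):
    """Count mutual nearest-neighbour matches between two local descriptor sets."""
    if not local_a or not local_b:
        return 0
    a_to_b = [nearest(d, local_b) for d in local_a]
    b_to_a = [nearest(d, local_a) for d in local_b]
    return sum(1 for i, j in enumerate(a_to_b) if b_to_a[j] == i)


def relocalize(query_global, query_local, db_globals, db_locals, k=2):
    """Stage 2 (assumed): re-rank only the top-k shortlist by local match count."""
    candidates = shortlist(query_global, db_globals, k)
    return max(candidates, key=lambda i: mutual_matches(query_local, db_locals[i]))


def warp_mask(mask, h_tgt_to_src, out_h, out_w, fill=0):
    """Warp a label mask with nearest-neighbour sampling.

    `h_tgt_to_src` is a 3x3 homography (nested lists) mapping target pixel
    coordinates (x, y) back to source coordinates, i.e. the inverse of the
    source-to-target registration.
    """
    h = h_tgt_to_src
    out = [[fill] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            w = h[2][0] * x + h[2][1] * y + h[2][2]
            if w == 0:
                continue  # point at infinity, leave as fill
            u = round((h[0][0] * x + h[0][1] * y + h[0][2]) / w)
            v = round((h[1][0] * x + h[1][1] * y + h[1][2]) / w)
            if 0 <= v < len(mask) and 0 <= u < len(mask[0]):
                out[y][x] = mask[v][u]
    return out
```

Restricting the more expensive local matching to a short candidate list is what makes a hierarchical scheme cheaper than exhaustively matching the query against every database image. Nearest-neighbour sampling is used in warp_mask so that class labels are copied rather than blended; once a mask from one visit is warped into the frame of another, the two can be compared pixel by pixel.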
