Collaborative Visual Place Recognition through Federated Learning
Mattia Dutto, Gabriele Berton, Debora Caldarola, Eros Fanì, Gabriele Trivigno, Carlo Masone
TL;DR
This paper tackles the challenge of training Visual Place Recognition models in a privacy-preserving, distributed setting by introducing FedVPR, a federated learning framework where geospatially diverse clients perform local mining and train on private data while a central server aggregates updates via FedAvg. It formalizes the problem for VPR in FL, analyzes data and system heterogeneity, and proposes three MSLS-based federated dataset splits (Proximity, Clustering, Random) to mimic real-world deployments. Through extensive experiments, the authors show that FedVPR can approach centralized performance with careful design choices, while highlighting the impacts of local data quantity, augmentations, and mining distributions on learning. The work provides a practical foundation for privacy-preserving VPR and opens avenues for applying federated learning to other image retrieval tasks on edge devices.
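The server-side aggregation step mentioned above can be illustrated with a minimal sketch of FedAvg. It assumes each client's model is flattened into a plain list of floats and that clients are weighted by their local dataset size, as in the standard FedAvg formulation. The function name `fedavg` and its arguments are hypothetical and not taken from the paper's code.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighting each by its local dataset size.

    client_weights: list of equal-length lists of floats (one per client).
    client_sizes:   number of local training samples held by each client.
    Both names are illustrative assumptions, not the paper's API.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one dataset size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        # Each client contributes in proportion to its share of the data.
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged
```

Under this weighting, a client holding three times as many images as another pulls the global model three times as strongly, which is one reason local data quantity matters for the learning dynamics discussed above.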
Abstract
Visual Place Recognition (VPR) aims to estimate the location of an image by treating localization as a retrieval problem. VPR uses a database of geo-tagged images and leverages deep neural networks to extract a global representation, called a descriptor, from each image. While the training data for VPR models often originates from diverse, geographically scattered sources (geo-tagged images), the training process itself is typically assumed to be centralized. This research revisits the task of VPR through the lens of Federated Learning (FL), addressing several key challenges associated with this adaptation. VPR data inherently lacks well-defined classes, and models are typically trained using contrastive learning, which requires a data mining step on a centralized database. Additionally, client devices in federated systems can be highly heterogeneous in terms of their processing capabilities. The proposed FedVPR framework not only presents a novel approach for VPR but also introduces a new, challenging, and realistic task for FL research, paving the way for other image retrieval tasks in FL.
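To make the retrieval formulation concrete, the sketch below ranks a database of geo-tagged descriptors against a query descriptor and returns the geo-tags of the closest matches. It is only an illustration: the descriptors are assumed to be plain lists of floats rather than network outputs, cosine similarity is an assumed choice of metric, and the names `cosine_similarity` and `retrieve` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve(query_descriptor, database, top_k=1):
    """Return the geo-tags of the top_k database entries most similar to the query.

    database: list of (descriptor, geotag) pairs, where geotag is any
    location label such as a (latitude, longitude) tuple.
    """
    ranked = sorted(
        database,
        key=lambda entry: cosine_similarity(query_descriptor, entry[0]),
        reverse=True,  # most similar first
    )
    return [geotag for _, geotag in ranked[:top_k]]
```

The predicted location of the query is then taken from the geo-tag of the best match, for example `retrieve(query, database, top_k=1)[0]`.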
