$A^2$GC: $A$symmetric $A$ggregation with Geometric Constraints for Locally Aggregated Descriptors
Zhenyu Li, Tianyi Shang
TL;DR
This work addresses Visual Place Recognition (VPR) under distributional and geometric variability by replacing symmetric optimal transport with an asymmetric aggregation framework and by incorporating geometric constraints. The $A^2$GC-VPR method combines row-column normalization averaging with independent marginal calibration to adapt to imbalanced feature and cluster distributions, while learnable coordinate embeddings promote spatially coherent assignments via a geometric compatibility score. Empirical results on Pitts30k, Pitts250k, MSLS, Nordland, and SPED show state-of-the-art or competitive performance, and ablations confirm that asymmetric aggregation and geometric constraints are complementary. The approach also shows strong cross-domain generalization and practical efficiency, which suits it to real-world VPR deployments.
Abstract
Visual Place Recognition (VPR) aims to match query images against a database using visual cues. State-of-the-art methods aggregate features from deep backbones to form global descriptors. Optimal transport-based aggregation methods reformulate feature-to-cluster assignment as a transport problem, but the standard Sinkhorn algorithm treats source and target marginals symmetrically, which limits its effectiveness when image features and cluster centers follow substantially different distributions. We propose $A^2$GC-VPR, an asymmetric aggregation method with geometric constraints for locally aggregated descriptors. Our method employs row-column normalization averaging with separate marginal calibration, enabling asymmetric matching that adapts to these distributional discrepancies. Geometric constraints are incorporated through learnable coordinate embeddings, which yield compatibility scores that are fused with feature similarities; this encourages spatially proximal features to be assigned to the same cluster and enhances spatial awareness. Experimental results on the MSLS, Nordland, and Pittsburgh datasets demonstrate superior performance, validating the effectiveness of our approach in improving matching accuracy and robustness.
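To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that "row-column normalization averaging" means averaging a softmax over clusters with a softmax over features, that "separate marginal calibration" can be approximated by independent temperatures (`row_temp`, `col_temp`), and that the geometric compatibility score is a negative squared distance between coordinate embeddings, fused additively with weight `lam`. All function names, parameters, and these specific formulas are hypothetical illustrations; in the actual method the coordinate embeddings would be learned.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def asymmetric_assign(feats, centers, coords, center_coords,
                      lam=0.5, row_temp=1.0, col_temp=1.0):
    """Hypothetical asymmetric feature-to-cluster assignment.

    feats:         (N, D) local features
    centers:       (K, D) cluster centers
    coords:        (N, C) coordinate embeddings of the features (assumed form)
    center_coords: (K, C) coordinate embeddings of the clusters (assumed form)
    """
    # Feature similarity between every feature and every cluster: (N, K).
    sim = feats @ centers.T
    # Assumed geometric compatibility: negative squared distance in embedding space.
    diff = coords[:, None, :] - center_coords[None, :, :]
    geo = -(diff ** 2).sum(axis=-1)
    # Additive fusion of appearance and geometry (an assumption of this sketch).
    scores = sim + lam * geo
    # Row normalization: each feature spreads unit mass over the clusters.
    row = softmax(scores / row_temp, axis=1)
    # Column normalization: each cluster spreads unit mass over the features.
    col = softmax(scores / col_temp, axis=0)
    # Averaging the two avoids forcing both marginals at once, unlike Sinkhorn.
    return 0.5 * (row + col)

def aggregate(feats, assign):
    # Weighted sum of features per cluster, flattened and L2-normalized.
    desc = (assign.T @ feats).ravel()
    return desc / (np.linalg.norm(desc) + 1e-12)

rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))          # 6 local features, 4 dimensions
centers = rng.normal(size=(3, 4))        # 3 clusters
coords = rng.normal(size=(6, 2))         # stand-ins for learned embeddings
center_coords = rng.normal(size=(3, 2))
assign = asymmetric_assign(feats, centers, coords, center_coords)
descriptor = aggregate(feats, assign)
print(assign.shape, descriptor.shape)
```

Because the row-normalized matrix sums to the number of features and the column-normalized matrix sums to the number of clusters, the averaged assignment satisfies neither marginal exactly. That relaxation is what this sketch uses to stand in for asymmetric matching.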
