UniABG: Unified Adversarial View Bridging and Graph Correspondence for Unsupervised Cross-View Geo-Localization

Cuiqun Chen, Qi Chen, Bin Yang, Xingyi Zhang

TL;DR

UniABG tackles unsupervised cross-view geo-localization by addressing both view discrepancy and noisy pseudo-label propagation. It introduces a dual-stage framework: View-Aware Adversarial Bridging (VAAB) first bridges the drone-satellite appearance gap, using an Auxiliary Pseudo View (APV) generated via color transfer, and Heterogeneous Graph Filtering Calibration (HGFC) then refines cross-view correspondences to enforce structural consistency across views. The approach achieves state-of-the-art unsupervised performance on University-1652 and SUES-200, even surpassing many supervised baselines, which indicates robustness to domain gaps and annotation scarcity. UniABG thus offers a scalable, label-free approach to localization across drone and satellite imagery, and it highlights the value of combining adversarial view alignment with graph-based pseudo-label purification in cross-view matching.

Abstract

Cross-view geo-localization (CVGL) matches query images (e.g., drone) to geographically corresponding opposite-view imagery (e.g., satellite). While supervised methods achieve strong performance, their reliance on extensive pairwise annotations limits scalability. Unsupervised alternatives avoid annotation costs but suffer from noisy pseudo-labels due to intrinsic cross-view domain gaps. To address these limitations, we propose UniABG, a novel dual-stage unsupervised cross-view geo-localization framework integrating adversarial view bridging with graph-based correspondence calibration. Our approach first employs View-Aware Adversarial Bridging (VAAB) to model view-invariant features and enhance pseudo-label robustness. Subsequently, Heterogeneous Graph Filtering Calibration (HGFC) refines cross-view associations by constructing dual inter-view structure graphs, achieving reliable view correspondence. Extensive experiments demonstrate state-of-the-art unsupervised performance, showing that UniABG improves Satellite $\rightarrow$ Drone AP by +10.63% on University-1652 and +16.73% on SUES-200, even surpassing supervised baselines. The source code is available at https://github.com/chenqi142/UniABG
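
The abstract does not spell out how HGFC filters cross-view associations, so the following is only an illustrative sketch of one common consistency check, mutual nearest-neighbor filtering: a drone-satellite pair is kept only if each feature ranks the other in its top-k. The function names (`cosine`, `top_k`, `mutual_matches`), the cosine similarity, and the toy 2-D features are all assumptions made for illustration, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k(query, gallery, k):
    """Indices of the k gallery vectors most similar to `query`."""
    ranked = sorted(range(len(gallery)),
                    key=lambda j: cosine(query, gallery[j]),
                    reverse=True)
    return set(ranked[:k])

def mutual_matches(drone_feats, sat_feats, k=1):
    """Keep (drone, satellite) index pairs that rank each other in their top-k.

    Hypothetical stand-in for a cross-view filtering step: one neighbor list
    per direction (drone->satellite and satellite->drone), intersected.
    """
    d2s = [top_k(d, sat_feats, k) for d in drone_feats]
    s2d = [top_k(s, drone_feats, k) for s in sat_feats]
    return [(i, j)
            for i, nbrs in enumerate(d2s)
            for j in sorted(nbrs)
            if i in s2d[j]]

# Toy features: drone 2 leans toward satellite 0, but satellite 0 prefers drone 0,
# so the one-directional association (2, 0) is filtered out.
drone_feats = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]]
sat_feats = [[0.9, 0.1], [0.1, 0.9]]
pairs = mutual_matches(drone_feats, sat_feats, k=1)
```

In this toy case only pairs confirmed from both directions survive, which is the general intuition behind discarding noisy pseudo-label associations before using them as supervision.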

Paper Structure

This paper contains 19 sections, 25 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: Key challenges in unsupervised CVGL. (a) Drastic appearance differences between drone and satellite views. (b) Feature space ambiguity, where the distance between different views of the same category may exceed the distance between different categories. (c) An enlarged view of the clustering space of category A ($\mathbf{C_a}$): noisy instances within clusters lead to incorrect pseudo-label associations. Different colors represent different categories.
  • Figure 2: The overall architecture of the proposed dual-stage UniABG. The first stage employs adversarial learning to reduce the discrepancy between views. The second stage uses heterogeneous graph filtering calibration to construct cross-view associations, which then serve as supervision for training (see the text for details).
  • Figure 3: The t-SNE visualization of the baseline and our proposed method. We randomly selected 50 location categories from the University-1652 dataset, with each color representing a location. "$\triangle$" represents satellite features, and "$\circ$" represents UAV features. Red circles indicate ambiguous matches.
  • Figure 4: Visual examples of the generated Auxiliary Pseudo View (APV). For each sample, the left column shows the original drone-view image. The middle column displays the generated APV after applying color transfer. The right column presents the corresponding ground-truth satellite-view image.
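
The Auxiliary Pseudo View in Figure 4 is described only as the result of "color transfer" applied to a drone image. As a rough sketch of what such a step can look like, the snippet below matches per-channel mean and standard deviation of a source image to those of a reference image. This is a simplified statistics-matching scheme in RGB, chosen here as an assumption; the paper's exact color-transfer method, color space, and any helper names (`channel_stats`, `transfer_color`) are not given in this summary and are hypothetical.

```python
from statistics import fmean, pstdev

def channel_stats(pixels):
    """Return per-channel (mean, population std) for a list of (r, g, b) tuples."""
    channels = list(zip(*pixels))
    return [(fmean(c), pstdev(c)) for c in channels]

def transfer_color(source, reference, eps=1e-6):
    """Shift and scale each channel of `source` to match `reference` statistics.

    Assumed, simplified color transfer: for every channel,
    out = (value - src_mean) * (ref_std / src_std) + ref_mean,
    clipped to the valid [0, 255] range.
    """
    src_stats = channel_stats(source)
    ref_stats = channel_stats(reference)
    out = []
    for pixel in source:
        new_pixel = []
        for value, (s_mean, s_std), (r_mean, r_std) in zip(pixel, src_stats, ref_stats):
            scaled = (value - s_mean) * (r_std / (s_std + eps)) + r_mean
            new_pixel.append(min(255.0, max(0.0, scaled)))
        out.append(tuple(new_pixel))
    return out

# Toy "images" as flat pixel lists: a bright drone view and a darker satellite view.
drone = [(200, 180, 160), (220, 190, 150), (180, 170, 170), (210, 200, 140)]
satellite = [(60, 90, 50), (80, 110, 70), (40, 70, 30), (70, 100, 60)]
pseudo_view = transfer_color(drone, satellite)
```

The output keeps the drone image's spatial content (pixel order is unchanged) while its color statistics move toward the satellite image, which is the role the caption attributes to the middle column. A real pipeline would operate on image arrays rather than Python lists.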