Latent Domain Modeling Improves Robustness to Geographic Shifts

Ruth Crasto, Esther Rolf

TL;DR

The paper addresses geographic distribution shift by reframing it as a subpopulation shift and proposes latent domain modeling: location encoders learn continuous domain latents from geographic coordinates, and the main task predictor is conditioned on these jointly trained latents. The core method fuses image features with location embeddings through a configurable fusion module and is trained with a task loss plus an auxiliary domain-prediction loss that guides the location embeddings. Across four geo-tagged image datasets, the method consistently improves worst-group performance, sets a new state of the art on WILDS FMoW and PovertyMap, and lies on the Pareto frontier of worst-group versus average performance. The approach is efficient, works with various location encoders and fusion strategies, and has practical implications for robust global-scale deployment in geospatial prediction tasks.
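
To make the two evaluation quantities concrete, the following is a minimal sketch of how overall average accuracy and worst-group accuracy could be computed for a classification task. The function names and the toy continent labels are hypothetical and are not taken from the paper; a regression task would need a different per-group metric.

```python
from collections import defaultdict


def group_accuracies(preds, labels, groups):
    """Return per-group accuracy as a dict {group: accuracy}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        total[g] += 1
        correct[g] += int(p == y)
    return {g: correct[g] / total[g] for g in total}


def average_and_worst_group(preds, labels, groups):
    """Return (overall accuracy, accuracy of the worst-performing group)."""
    per_group = group_accuracies(preds, labels, groups)
    average = sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)
    worst = min(per_group.values())
    return average, worst


# Toy data (hypothetical): six samples split across two groups.
preds = [1, 0, 1, 1, 0, 1]
labels = [1, 0, 1, 0, 1, 1]
groups = ["Africa", "Africa", "Africa", "Asia", "Asia", "Asia"]

average, worst = average_and_worst_group(preds, labels, groups)
print(f"average={average:.3f} worst_group={worst:.3f}")
```

A model can score well on the overall average while one group lags far behind, which is why the two numbers are reported together.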

Abstract

Geographic distribution shift arises when the distribution of locations on Earth in a training dataset is different from what is seen at inference time. Using standard empirical risk minimization (ERM) in this setting can lead to uneven generalization across different spatially-determined groups of interest such as continents or biomes. The most common approaches to tackling geographic distribution shift apply domain adaptation methods using discrete group labels, ignoring geographic coordinates that are often available as metadata. On the other hand, modeling methods that integrate geographic coordinates have been shown to improve overall performance, but their impact on geographic domain generalization has not been studied. In this work, we propose a general modeling framework for improving robustness to geographic distribution shift. The key idea is to model continuous, latent domain assignment using location encoders and to condition the main task predictor on the jointly-trained latents. On four diverse geo-tagged image datasets with different group splits, we show that instances of our framework achieve significant improvements in worst-group performance compared to existing domain adaptation and location-aware modeling methods. In particular, we achieve new state-of-the-art results on two datasets from the WILDS benchmark.
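
The training objective described above combines a task loss with an auxiliary domain-prediction loss. The sketch below shows one plausible form for a single sample, assuming both terms are cross-entropy losses and that they are combined as `task + alpha * domain`. The exact weighting used in the paper is not stated here, so the formula, the function names, and the toy logits are all assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(logits, target):
    """Negative log-likelihood of the target class index."""
    return -math.log(softmax(logits)[target])


def combined_loss(task_logits, task_label, domain_logits, domain_label, alpha):
    """Assumed objective: task loss plus alpha times the domain-prediction loss."""
    task = cross_entropy(task_logits, task_label)
    domain = cross_entropy(domain_logits, domain_label)
    return task + alpha * domain


# Toy example (hypothetical): 2 task classes, 4 domains, uniform logits.
loss = combined_loss([0.0, 0.0], 0, [0.0, 0.0, 0.0, 0.0], 2, alpha=0.2)
print(f"loss={loss:.4f}")
```

Setting `alpha` to zero recovers plain task training with no domain supervision on the location embeddings.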

Paper Structure

This paper contains 43 sections, 5 equations, 5 figures, 17 tables.

Figures (5)

  • Figure 1: Illustration of our proposed framework. The three primary components are shown on the left: the image encoder, location encoder trained with domain prediction, and fusion module. The domain predictor, in dashed red outline, is discarded at inference time. In this work, we experiment with four options for the fusion module, shown on the right: feature concatenation, Geo Priors, FiLM, and D3G.
  • Figure 2: Scatter plots for different combinations of dataset and location encoder, each showing the overall average against worst group performance across different methods. Each result is averaged over the official 5 data folds for PovertyMap and over 3 random seeds for other datasets. Our proposed methods are consistently on the Pareto frontier in all plots.
  • Figure 3: Left: Map of the 14 biomes used in the iNat-Biomes dataset. Center: Visualization of location embedding clusters obtained from a WRAP Concat model trained on iNat-Biomes with domain prediction weight $\alpha = 0.2$. Each color represents a different cluster. Right: Location embedding clusters using the same model trained without domain prediction. The domain prediction loss in our proposed framework allows the location embedding space to align more closely with the true spatial distribution of domains.
  • Figure 4: Worst group accuracy of the WRAP Concat models trained on iNat-Biomes for different values of the domain prediction weight $\alpha$. Each result is averaged over 3 random seeds.
  • Figure 5: Overall average accuracy vs. worst group accuracy. Each result is averaged over 3 random seeds.
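
Two of the fusion options named in the Figure 1 caption, feature concatenation and FiLM, can be sketched in a few lines. This is an illustrative sketch only: the helper names, the toy weights, and the choice to modulate the image feature vector directly are assumptions, not the paper's implementation. Geo Priors and D3G are omitted.

```python
def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def concat_fusion(image_feat, loc_emb):
    """Concatenate image features and the location embedding."""
    return list(image_feat) + list(loc_emb)


def film_fusion(image_feat, loc_emb, w_gamma, b_gamma, w_beta, b_beta):
    """FiLM: scale and shift each image feature using the location embedding."""
    gamma = linear(w_gamma, b_gamma, loc_emb)  # per-feature scale
    beta = linear(w_beta, b_beta, loc_emb)     # per-feature shift
    return [g * x + b for g, x, b in zip(gamma, image_feat, beta)]


# Toy inputs (hypothetical): 3 image features, 2-dimensional location embedding.
image_feat = [1.0, 2.0, 3.0]
loc_emb = [0.5, -1.0]
w_gamma = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b_gamma = [1.0, 1.0, 1.0]
w_beta = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
b_beta = [0.1, 0.2, 0.3]

concat_out = concat_fusion(image_feat, loc_emb)
film_out = film_fusion(image_feat, loc_emb, w_gamma, b_gamma, w_beta, b_beta)
print(concat_out, film_out)
```

Concatenation leaves the downstream predictor to learn how the two inputs interact, whereas FiLM lets the location embedding rescale and shift individual image features.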