Get Your Embedding Space in Order: Domain-Adaptive Regression for Forest Monitoring

Sizhuo Li, Dimitri Gominski, Martin Brandt, Xiaoye Tong, Philippe Ciais

TL;DR

The paper addresses cross-domain image-level forest regression under limited target-domain data by introducing the DRIFT dataset and a two-component method. It combines Geometric Order Learning (GOL) to create a well-ordered embedding space with Manifold Diffusion for Regression (MDR) to refine predictions in a transductive, few-shot setting. Across five European countries and three forest targets, transductive GOL+MDR consistently outperforms inductive baselines, especially when domain gaps are large, and ablations show ordered embeddings are essential for MDR’s effectiveness. The work provides a practical benchmark for universal, low-data domain adaptation in Earth observation and demonstrates tangible improvements in canopy height, tree counts, and canopy cover estimation.

Abstract

Image-level regression is an important task in Earth observation, where visual domain and label shifts are a core challenge hampering generalization. However, cross-domain regression within remote sensing data remains understudied due to the absence of suitable datasets. We introduce a new dataset with aerial and satellite imagery in five countries with three forest-related regression tasks. To match real-world application needs, we compare methods through a restrictive setup where no prior on the target domain is available during training, and models are adapted with limited information during testing. Building on the assumption that ordered relationships generalize better, we propose manifold diffusion for regression as a strong baseline for transduction in low-data regimes. Our comparison highlights the respective advantages of inductive and transductive methods in cross-domain regression.
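
The abstract names manifold diffusion for regression without giving the algorithm on this page. As a rough illustration of what transductive diffusion over an embedding space can look like, the sketch below runs a generic label-spreading-style iteration on a k-nearest-neighbour graph and normalises by a diffused "labelled" mask so that outputs stay on the label scale. This is an assumption for illustration, not the authors' MDR: the function names, the Gaussian edge weights, the mask normalisation, and the values of `k`, `alpha`, and `steps` are all hypothetical choices.

```python
import math


def knn_affinity(points, k):
    """Symmetric k-nearest-neighbour graph with Gaussian edge weights (toy choice)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for j in nearest:
            w = math.exp(-dist[i][j] ** 2)
            weights[i][j] = weights[j][i] = w  # keep the graph undirected
    return weights


def diffuse_regression(points, labels, k=2, alpha=0.9, steps=200):
    """Spread the few known labels over the graph.

    `labels` holds a float for labelled points and None for unlabelled ones.
    Assumes every point is connected to at least one labelled point.
    """
    n = len(points)
    weights = knn_affinity(points, k)
    degree = [sum(row) for row in weights]
    # Symmetrically normalised affinity S = D^-1/2 W D^-1/2.
    s = [[weights[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
         for i in range(n)]
    y = [0.0 if v is None else float(v) for v in labels]
    mask = [0.0 if v is None else 1.0 for v in labels]

    def step(state, seed):
        # One diffusion update: state <- alpha * S @ state + (1 - alpha) * seed.
        return [alpha * sum(s[i][j] * state[j] for j in range(n)) + (1 - alpha) * seed[i]
                for i in range(n)]

    f, g = y[:], mask[:]
    for _ in range(steps):
        f, g = step(f, y), step(g, mask)
    # Diffused labels divided by diffused mask: a weighted average of known labels.
    return [fi / gi for fi, gi in zip(f, g)]


# Hypothetical 1-D "embeddings" ordered along a line, with labels at the two ends only.
embeddings = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
labels = [0.0, None, None, None, 10.0]
print([round(p, 2) for p in diffuse_regression(embeddings, labels)])
```

In this toy setup the embedding is already ordered along one axis, so diffusion yields predictions that increase from one labelled end to the other. If the embedding did not reflect label order, neighbours on the graph would carry unrelated labels and the same procedure would average them together, which is one way to read the claim that ordered embeddings matter for diffusion.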

Paper Structure (23 sections, 3 equations, 13 figures, 8 tables)

Figures (13)

  • Figure 1: Vegetation regression across countries. We hypothesize that order relations generalize better through visual and label domains than direct regression. Predicting tree counts directly requires domain-specific knowledge about local species, whereas predicting ordered chains remains intuitively easy.
  • Figure 1: Examples in the DRIFT dataset. Labels on the right: canopy height in meters (1st row), tree count (2nd row), and tree cover fraction (3rd row).
  • Figure 2: Challenging examples in the DRIFT dataset: despite similar visual content, images can have different values in the label space. More visual examples can be found in Supplementary Fig. 1.
  • Figure 2: Biogeographical region distribution reflects the diversity of biomes in the DRIFT dataset. The red polygons illustrate the exact locations from which image patches were extracted to build the DRIFT dataset. Image credits: European Environment Agency. https://www.eea.europa.eu/data-and-maps/figures/biogeographical-regions-in-europe-2
  • Figure 3: Label shift varies across countries and tasks. We plot Wasserstein distances between label distributions in country subsets to indicate the level of label shift.
  • ...and 8 more figures
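
The caption of Figure 3 describes measuring label shift with Wasserstein distances between the label distributions of country subsets. A minimal sketch of that computation for one-dimensional labels is given below, assuming each subset is simply a list of label values. The subset names and numbers are made up for illustration and are not taken from the DRIFT dataset, and `wasserstein_1d` is a hypothetical helper rather than code from the paper.

```python
def wasserstein_1d(a, b):
    """1-Wasserstein distance between two empirical 1-D distributions.

    Computed as the integral of |F_a(x) - F_b(x)| over x, where F_a and F_b
    are the empirical cumulative distribution functions of the two samples.
    """
    a, b = sorted(a), sorted(b)
    support = sorted(set(a) | set(b))
    total = 0.0
    i = j = 0
    for left, right in zip(support, support[1:]):
        # Advance the counters so that i / len(a) equals F_a(left), same for b.
        while i < len(a) and a[i] <= left:
            i += 1
        while j < len(b) and b[j] <= left:
            j += 1
        total += abs(i / len(a) - j / len(b)) * (right - left)
    return total


# Hypothetical canopy-height labels (metres) for three unnamed country subsets.
subsets = {
    "country_a": [4.0, 6.5, 8.0, 12.0, 15.5],
    "country_b": [5.0, 7.0, 9.5, 13.0, 16.0, 18.0],
    "country_c": [14.0, 18.5, 21.0, 25.0],
}

# Pairwise distances: larger values indicate a larger label shift between subsets.
names = sorted(subsets)
for x in names:
    print(x, [round(wasserstein_1d(subsets[x], subsets[y]), 2) for y in names])
```

The distance is expressed in the units of the label itself (metres in this toy case), so values are comparable across country pairs within one task but not across tasks with different label scales.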