AGBD: A Global-scale Biomass Dataset

Ghjulia Sialelli, Torben Peters, Jan D. Wegner, Konrad Schindler

TL;DR

The paper presents AGBD, the first globally distributed, high-resolution (10 m) benchmark dataset for Above Ground Biomass estimation that fuses GEDI AGB references with Sentinel-2 and PALSAR-2 data, augmented by precomputed canopy height, elevation, and land-cover maps. It provides a scalable ML-ready resource spanning 2019–2020, along with a dense 10 m AGB prediction map and a suite of baseline models to facilitate benchmarking. Through careful region selection and patch-based train/validation/test splits that preserve vegetation distributions, the study demonstrates that multi-modal inputs improve AGB predictions (with RMSE near 60 Mg/ha) and that canopy height alone is insufficient for accurate AGB estimation. The dataset, benchmarks, and dense predictions are publicly available, offering a practical foundation for global high-resolution biomass mapping and future operational monitoring efforts.
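For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a root mean squared error (RMSE) between predicted and reference AGB values could be computed. The function name `rmse` and the sample values are illustrative assumptions; they are not taken from the paper or its repository.

```python
import math


def rmse(predictions, references):
    """Root mean squared error between predicted and reference AGB (Mg/ha)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        raise ValueError("at least one sample is required")
    squared_errors = [(p - r) ** 2 for p, r in zip(predictions, references)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values for illustration only; not drawn from the dataset.
reference_agb = [100.0, 200.0, 300.0]
predicted_agb = [110.0, 180.0, 330.0]
example_rmse = rmse(predicted_agb, reference_agb)
print(f"RMSE: {example_rmse:.2f} Mg/ha")
```

Because the errors are squared before averaging, a few large misses (for example in very dense forest) weigh more heavily on the RMSE than many small ones.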

Abstract

Accurate estimates of Above Ground Biomass (AGB) are essential in addressing two of humanity's biggest challenges: climate change and biodiversity loss. Existing datasets for AGB estimation from satellite imagery are limited. Either they focus on specific, local regions at high resolution, or they offer global coverage at low resolution. There is a need for a machine learning-ready, globally representative, high-resolution benchmark dataset. Our findings indicate significant variability in biomass estimates across different vegetation types, emphasizing the necessity for a dataset that accurately captures global diversity. To address these gaps, we introduce a comprehensive new dataset that is globally distributed, covers a range of vegetation types, and spans several years. This dataset combines AGB reference data from the GEDI mission with data from Sentinel-2 and PALSAR-2 imagery. Additionally, it includes pre-processed high-level features such as a dense canopy height map, an elevation map, and a land-cover classification map. We also produce a dense, high-resolution (10 m) map of AGB predictions for the entire area covered by the dataset. Rigorously tested, our dataset is accompanied by several benchmark models and is publicly available. It can be easily accessed using a single line of code, offering a solid basis for efforts towards global AGB estimation. The GitHub repository github.com/ghjuliasialelli/AGBD serves as a one-stop shop for all code and data.

Paper Structure (21 sections, 11 figures, 3 tables)

Figures (11)

  • Figure 1: The regions of interest: California (USA), Cuba, Austria, Greece, Nepal, Shaanxi (China), French Guiana, Paraguay, Ghana, Tanzania, New Zealand.
  • Figure 2: Per-biome distribution of GEDI AGBD values and ESA CCI residuals. Gray biomes are outside of GEDI coverage.
  • Figure 3: Land cover distribution across the world (left) and across our subset (right).
  • Figure 4: Visualization of all data sources along with our training pipeline. Note the sparse GEDI labels, which are upgraded to dense AGB maps by the prediction.
  • Figure 5: Binned test residuals for the best-performing model of each architecture, and for the ESA CCI predictions.
  • ...and 6 more figures
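
As a rough illustration of what "binned residuals" (Figure 5) could mean in practice, the sketch below groups samples by their reference AGB and averages the residual within each bin. The bin width of 50 Mg/ha, the sign convention (prediction minus reference), and the function name `binned_mean_residuals` are assumptions made for this example, not details confirmed by the paper.

```python
from collections import defaultdict


def binned_mean_residuals(predictions, references, bin_width=50.0):
    """Mean residual (prediction - reference) per reference-AGB bin.

    Returns a dict mapping each bin's lower edge (Mg/ha) to the mean
    residual of the samples whose reference value falls in that bin.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pred, ref in zip(predictions, references):
        bin_start = (ref // bin_width) * bin_width  # lower edge of the bin
        sums[bin_start] += pred - ref
        counts[bin_start] += 1
    return {edge: sums[edge] / counts[edge] for edge in sorted(sums)}


# Made-up values for illustration only; not drawn from the dataset.
reference_agb = [10.0, 40.0, 60.0, 120.0]
predicted_agb = [20.0, 50.0, 50.0, 90.0]
residuals_by_bin = binned_mean_residuals(predicted_agb, reference_agb)
print(residuals_by_bin)
```

Under this sign convention, a negative mean in the higher bins would indicate that a model tends to underestimate dense biomass.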