Mapping biodiversity at very-high resolution in Europe

César Leblanc, Lukas Picek, Benjamin Deneu, Pierre Bonnet, Maximilien Servajean, Rémi Palard, Alexis Joly

TL;DR

This work introduces a cascading, multimodal pipeline to map biodiversity at a continental European scale at $50\times50\text{m}$ resolution by integrating a deep-SDM with multi-source remote sensing and climate data, computing biodiversity indicators, and inferring habitats with Pl@ntBERT-based habitat classification. The GeoPlant dataset supports learning from both presence-only and presence-absence data, enabling joint modeling of interspecies dependencies while mitigating sampling bias. The approach yields high-resolution species distribution maps for thousands of species, seven indicator maps with quantified uncertainty, and extensive habitat maps (EUNIS Level 3) across Europe, demonstrating strong discriminatory performance (e.g., AUC $=0.931$) and practical utility for conservation and land-use planning. Despite scale-related evaluation challenges and data biases, the framework offers a scalable, interpretable pipeline for dynamic biodiversity monitoring aligned with the EU biodiversity strategy.

Abstract

This paper describes a cascading multimodal pipeline for high-resolution biodiversity mapping across Europe, integrating species distribution modeling, biodiversity indicators, and habitat classification. The proposed pipeline first predicts species compositions using a deep-SDM, a multimodal model trained on remote sensing, climate time series, and species occurrence data at $50\times50\text{m}$ resolution. These predictions are then used to generate biodiversity indicator maps and classify habitats with Pl@ntBERT, a transformer-based LLM designed for species-to-habitat mapping. With this approach, continental-scale species distribution maps, biodiversity indicator maps, and habitat maps are produced, providing fine-grained ecological insights. Unlike traditional methods, this framework enables joint modeling of interspecies dependencies, bias-aware training with heterogeneous presence-absence data, and large-scale inference from multi-source remote sensing inputs.
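
The cascade described above (species prediction, then indicators, then habitat classification) can be illustrated with a minimal sketch for a single grid cell. Everything concrete here is an assumption made for illustration: the function names, the 0.5 threshold, the use of species richness as a stand-in indicator (this section does not name the seven indicators), and the probability values, which are invented and not results from the paper.

```python
from typing import Dict, List


def predicted_species(probs: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Keep species whose predicted probability reaches the threshold.

    The threshold value is hypothetical; the result is ordered from most
    to least likely, with ties broken alphabetically.
    """
    kept = [(p, name) for name, p in probs.items() if p >= threshold]
    kept.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in kept]


def species_richness(species: List[str]) -> int:
    """Stand-in biodiversity indicator: number of predicted species."""
    return len(species)


def habitat_model_input(species: List[str]) -> str:
    """Format the species list as text for a species-to-habitat classifier.

    The comma-separated format is an assumption, not the documented
    Pl@ntBERT input format.
    """
    return ", ".join(species)


# Invented per-species probabilities for one 50 x 50 m cell (illustrative only).
cell_probs = {
    "Quercus ilex": 0.91,
    "Pinus halepensis": 0.74,
    "Fagus sylvatica": 0.08,
    "Rosmarinus officinalis": 0.62,
}

species = predicted_species(cell_probs)
richness = species_richness(species)
text = habitat_model_input(species)
```

In this sketch each stage consumes only the output of the previous one, which is the sense in which the pipeline is cascading: the indicator and the habitat input are both derived from the same predicted species list.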

Paper Structure

This paper contains 12 sections, 5 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Geospatial scale of the dataset (from geoplant2024picek). The 5M PO occurrences (9,709 species) span all of Europe, but the 90K PA surveys (5,016 species) are primarily in France and Denmark.
  • Figure 2: Selected SDM architecture (from geoplant2024picek). This multimodal ensemble model processes each modality (i.e., satellite images, climatic cubes, and Landsat cubes) through a lightweight 6-layer residual encoder (i.e., ResNet-6). The embeddings are then concatenated and passed to a final classification layer.
  • Figure 3: Pl@ntBERT HDM (from leblanc2024pl) processes the input (list of species predicted by the deep-SDM) through multiple encoder layers, with the [CLS] token representation passed to a classifier to predict the most likely habitat type.
  • Figure 4: Example species distribution maps for two selected species occurring in France and Greece, shown at different zoom levels. The deep-SDM produces these maps across Europe for over 5,500 plant species at $50\times50\text{m}$ resolution.
  • Figure 5: Example biodiversity indicator maps for two selected indicators, shown over Belgium and the Czech Republic at different zoom levels. These maps are derived from the deep-SDM output across Europe for seven biodiversity indicators at $50\times50\text{m}$ resolution.
  • ...and 1 more figure
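
The fusion step described in the Figure 2 caption (per-modality embeddings concatenated and passed to a final classification layer) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the encoders are omitted, the embedding sizes, weights, and biases are invented, and the per-species sigmoid output is an assumed multi-label formulation.

```python
import math
from typing import List, Sequence


def concatenate(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Late fusion: join the per-modality embeddings into one vector."""
    fused: List[float] = []
    for embedding in embeddings:
        fused.extend(embedding)
    return fused


def classify(
    fused: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Linear layer followed by a sigmoid, giving one score per species.

    The sigmoid (independent per-species scores) is an assumption.
    """
    scores = []
    for row, bias in zip(weights, biases):
        logit = sum(w * x for w, x in zip(row, fused)) + bias
        scores.append(1.0 / (1.0 + math.exp(-logit)))
    return scores


# Hypothetical 2-dimensional embeddings, one per modality.
satellite_embedding = [1.0, 0.0]
climate_embedding = [0.5, 0.5]
landsat_embedding = [0.0, 1.0]

fused = concatenate([satellite_embedding, climate_embedding, landsat_embedding])

# Invented weights for two species over the 6-dimensional fused vector.
weights = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
biases = [0.0, 0.0]

scores = classify(fused, weights, biases)
```

Concatenating before a single classification layer means every species score can draw on all three modalities at once, while each encoder stays specific to its own input type.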