BioAnalyst: A Foundation Model for Biodiversity

Athanasios Trantas, Martino Mensio, Stylianos Stasinos, Sebastian Gribincea, Taimur Khan, Damian Podareanu, Aliene van der Veen

TL;DR

BioAnalyst is, to the authors' knowledge, the first multimodal foundation model tailored for biodiversity analytics. It integrates 10 data modalities on a $0.25^{\circ}$ grid to forecast regional- to national-scale ecological dynamics in Europe. The architecture combines a Perceiver IO encoder, a 3D Swin Transformer backbone, and a Perceiver IO decoder, trained on BioCube with two-time-step inputs and refined via roll-out fine-tuning using VeRA adapters. It demonstrates strong results on downstream tasks, including joint species distribution modelling and abiotic climate reconstruction, and enables an open-source workflow for reproducible research. The work highlights the potential of integrated multimodal representations to advance macroecological forecasting while acknowledging limitations such as uncertainty quantification and regional scope, pointing toward future enhancements and broader applicability.

Abstract

Multimodal Foundation Models (FMs) offer a path to learn general-purpose representations from heterogeneous ecological data that transfer easily to downstream tasks. However, practical biodiversity modelling remains fragmented: separate pipelines and models are built for each dataset and objective, which limits reuse across regions and taxa. In response, we present BioAnalyst, to our knowledge the first multimodal Foundation Model tailored to biodiversity analysis and conservation planning in Europe at $0.25^{\circ}$ spatial resolution, targeting regional- to national-scale applications. BioAnalyst employs a transformer-based architecture, pre-trained on extensive multimodal datasets that align species occurrence records with remote sensing indicators and climate and environmental variables. After pre-training, the model is adapted via lightweight roll-out fine-tuning to a range of downstream tasks, including joint species distribution modelling, biodiversity dynamics and population trend forecasting. The model is evaluated on two representative downstream use cases: (i) joint species distribution modelling with 500 vascular plant species and (ii) monthly climate linear probing with temperature and precipitation data. Our findings show that BioAnalyst can provide a strong baseline for both biotic and abiotic tasks, acting as a macroecological simulator with a yearly forecasting horizon and monthly resolution, offering the first application of this type of modelling in the biodiversity domain. We have open-sourced the model weights and the training and fine-tuning pipelines to advance AI-driven ecological research.
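The roll-out setting described above (two input time steps, monthly resolution, yearly horizon) can be illustrated with a minimal autoregressive loop. This is only a sketch under stated assumptions: `rollout` and `toy_step` are hypothetical names, and `toy_step` is a placeholder linear extrapolation standing in for the trained model, not the BioAnalyst forward pass. The grid shape [160, 280] is taken from the Figure 2 caption.

```python
import numpy as np

def rollout(step_fn, x_prev, x_curr, n_steps=12):
    """Autoregressive roll-out: each prediction is fed back as the newest input.

    step_fn is any callable mapping the two most recent states to the next one.
    """
    preds = []
    for _ in range(n_steps):
        x_next = step_fn(x_prev, x_curr)
        preds.append(x_next)
        # Slide the two-step input window forward by one time step.
        x_prev, x_curr = x_curr, x_next
    return np.stack(preds)

def toy_step(x_prev, x_curr):
    # Hypothetical stand-in for the model: plain linear extrapolation.
    return 2.0 * x_curr - x_prev

# Two consecutive monthly states on the European grid of size [160, 280].
x0 = np.zeros((160, 280))
x1 = np.ones((160, 280))

# 12 monthly steps correspond to the yearly horizon.
forecast = rollout(toy_step, x0, x1, n_steps=12)
print(forecast.shape)
```

Because every step consumes earlier predictions, errors can compound over the horizon, which is the usual motivation for fine-tuning on multi-step roll-outs rather than on single-step predictions alone.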

Paper Structure

This paper contains 51 sections, 20 equations, 12 figures, 8 tables.

Figures (12)

  • Figure 1: BioAnalyst is the first large-scale multimodal model for biodiversity, trained on 20 years of spatiotemporal data. The model ingests 10 distinct modalities, encoding and aligning them to latent ecological representations via the 3D Perceiver IO encoder. It then processes the latent space with the 3D Swin Transformer backbone and decodes it to produce accurate spatiotemporal predictions. BioAnalyst shows strong performance in downstream tasks such as (i) biotic and (ii) abiotic feature prediction and (iii) long-horizon prediction (12 timesteps = 1 year) across both space and time, and (iv) it is easily fine-tunable for any downstream task.
  • Figure 2: A visual explanation of the data pipeline. From left to right, we received the data from BioCube in a HyperCube format, where sampling a single-timestep slice produces a Data Batch containing worldwide observations. Selecting European coordinates produces a Data Sample with multiple modalities stacked on the selected coordinate grid of size [160, 280].
  • Figure 3: Mean Absolute Error for the 28 animal species on a 12-step rollout. The blue line shows the performance of BioAnalyst roll-out fine-tuned for $K=6$ steps, and the orange line shows it for $K=12$ steps.
  • Figure 4: Community Sørensen similarity between predicted and observed plant assemblages for 28 animal species across Europe, based on GBIF presence records. The mean similarity is $\bar{S}=0.31$, indicating that the model recovers roughly one third of the recorded community composition. Warm colours (yellow–red) denote higher assemblage agreement, while cool blues indicate little overlap and/or sparsely sampled regions.
  • Figure 5: (a) Ground truth and prediction spatial plots for the species Pieris brassicae (ID 1920506) on 01-4-2019, with an MAE of 0.00003. (b) Zoomed-in plot highlighting areas of interest, where the model captures the general distribution but not the high-density areas.
  • ...and 7 more figures
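
The Sørensen similarity reported in Figure 4 compares two species sets $A$ (predicted) and $B$ (observed) as $S = 2|A \cap B| / (|A| + |B|)$. The sketch below computes it per grid cell from binary presence arrays. It rests on assumptions not stated in the captions: the function name `sorensen` is hypothetical, predictions are taken to be already thresholded to presence/absence, cells with no species in either set are treated as undefined, and the mean is taken over the remaining cells.

```python
import numpy as np

def sorensen(pred, obs):
    """Per-cell Sørensen similarity for presence arrays of shape (n_species, n_cells)."""
    pred = np.asarray(pred, dtype=bool)
    obs = np.asarray(obs, dtype=bool)
    shared = np.logical_and(pred, obs).sum(axis=0)  # |A ∩ B| per cell
    total = pred.sum(axis=0) + obs.sum(axis=0)      # |A| + |B| per cell
    # Cells where both sets are empty give 0/0; mark them as NaN (assumption).
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(total > 0, 2.0 * shared / total, np.nan)

# Toy data: 4 species (rows) over 3 grid cells (columns).
pred = [[1, 0, 0],
        [1, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
obs = [[1, 1, 0],
       [1, 0, 0],
       [0, 0, 0],
       [1, 0, 0]]

s = sorensen(pred, obs)
mean_s = np.nanmean(s)  # mean over cells with at least one species
print(s, mean_s)
```

In the toy data, the first cell shares 2 species out of 2 predicted and 3 observed, giving $2 \cdot 2 / 5 = 0.8$; the second cell shares none, giving 0; the third cell is empty in both sets and is excluded from the mean.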