Multi-Scale and Multimodal Species Distribution Modeling

Nina van Tiel, Robin Zbinden, Emanuele Dalsasso, Benjamin Kellenberger, Loïc Pellissier, Devis Tuia

TL;DR

This work develops a modular structure for SDMs in which different modalities can be considered at different scales through a late fusion approach. Results indicate that considering multimodal data and learning multi-scale representations leads to more accurate models.

Abstract

Species distribution models (SDMs) aim to predict the distribution of species by relating occurrence data with environmental variables. Recent applications of deep learning to SDMs have enabled new avenues, specifically the inclusion of spatial data (environmental rasters, satellite images) as model predictors, allowing the model to consider the spatial context around each species' observations. However, the appropriate spatial extent of the images is not straightforward to determine and may affect the performance of the model, as scale is recognized as an important factor in SDMs. We develop a modular structure for SDMs that allows us to test the effect of scale in both single- and multi-scale settings. Furthermore, our model enables different scales to be considered for different modalities, using a late fusion approach. Results on the GeoLifeCLEF 2023 benchmark indicate that considering multimodal data and learning multi-scale representations leads to more accurate models.
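
The multi-scale, late fusion idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the `ScaleBranch` encoder (average pooling plus a random linear layer), the channel counts (19 bioclimatic variables, 4 satellite bands), the feature size, and the number of species are all assumptions made for illustration. Only the scales (bioclimatic $1\times 1$; satellite $25\times 25$, $59\times 59$, $115\times 115$) are taken from the Figure 1c caption.

```python
import numpy as np

def center_crop(raster, size):
    """Return the size x size window centered on a (channels, H, W) raster."""
    _, h, w = raster.shape
    top = (h - size) // 2
    left = (w - size) // 2
    return raster[:, top:top + size, left:left + size]

class ScaleBranch:
    """Hypothetical encoder for one (modality, scale) pair.

    Stands in for a learned encoder: it average-pools the cropped patch
    and applies a randomly initialized linear layer followed by ReLU.
    """
    def __init__(self, in_channels, out_features, size, rng):
        self.size = size
        self.weight = rng.normal(0.0, 0.1, (in_channels, out_features))

    def __call__(self, raster):
        patch = center_crop(raster, self.size)
        pooled = patch.mean(axis=(1, 2))              # shape: (in_channels,)
        return np.maximum(pooled @ self.weight, 0.0)  # shape: (out_features,)

def late_fusion_predict(branches_and_inputs, head_weight):
    """Encode each branch separately, concatenate features, then classify."""
    features = np.concatenate([branch(x) for branch, x in branches_and_inputs])
    logits = features @ head_weight
    return 1.0 / (1.0 + np.exp(-logits))  # per-species presence probability

rng = np.random.default_rng(0)
bioclim = rng.normal(size=(19, 9, 9))   # assumed: 19 bioclimatic variables
sat = rng.normal(size=(4, 115, 115))    # assumed: 4 satellite bands

# Scales follow the Figure 1c caption: bioclim 1; satellite 25, 59, 115.
branches_and_inputs = [(ScaleBranch(19, 8, 1, rng), bioclim)]
branches_and_inputs += [(ScaleBranch(4, 8, s, rng), sat) for s in (25, 59, 115)]

n_species = 5  # toy value, not the benchmark's species count
head_weight = rng.normal(0.0, 0.1, (len(branches_and_inputs) * 8, n_species))
probs = late_fusion_predict(branches_and_inputs, head_weight)
print(probs.shape)
```

Because each branch sees its own crop and is fused only at the feature level, adding or removing a scale or modality changes just the list of branches and the input width of the classification head.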

Paper Structure

This paper contains 11 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: Example of architectures with our modular structure for SDMs. a. Single-scale, unimodal model architecture for bioclimatic variables at scale $(5 \times 5)$. b. Multi-scale unimodal model architecture for bioclimatic variables at scales $(1\times 1)$, $(5 \times 5)$ and $(9\times 9)$. c. Multi-scale multimodal model architecture for bioclimatic variables at scale $(1\times 1)$, and Sentinel-2 satellite images at scales $(25 \times 25)$, $(59\times 59)$ and $(115\times 115)$. The receptive fields after the encoders for bioclimatic variables and satellite images are $1\times 1$ and $25 \times 25$ pixels, respectively.
  • Figure 2: Performance of bimodal and corresponding unimodal models, quantified by their validation median AUC and test micro-F1 scores.
  • Figure 3: Difference in median AUC ($\Delta$AUC) and micro-F1 ($\Delta$F1) between two unimodal and two bimodal models. Positive $\Delta$ values indicate that models [bioclim 1, 5] or [bioclim 1, 5 + sat 25, 59, 115] outperform models [bioclim 1] or [bioclim 1 + sat 25, 59, 115], respectively, and vice versa for negative values. a, b, d, e. $\Delta$AUC values plotted against the number of occurrences in the training data (nb train) and the validation data (nb val) for $2{,}173$ species. Colors indicate point density, with higher densities in yellow. c, f. $7{,}348$ validation sites plotted on maps and colored by $\Delta$F1.
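
The median AUC reported in Figures 2 and 3 can be read as a per-species AUC summarized by its median across species. The sketch below shows one way to compute such a quantity, assuming a presence/absence matrix `Y` and a score matrix `S` of shape (sites, species); these names, the toy data, and the pairwise AUC formulation are illustrative assumptions, and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def auc(y_true, scores):
    """AUC as the fraction of (presence, absence) pairs ranked correctly.

    Ties count as half. Returns NaN when a species has no presences or
    no absences, since the AUC is undefined in that case.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true]
    neg = scores[~y_true]
    if pos.size == 0 or neg.size == 0:
        return float("nan")
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def median_auc(Y, S):
    """Median of per-species AUC values, ignoring undefined ones."""
    per_species = [auc(Y[:, j], S[:, j]) for j in range(Y.shape[1])]
    return float(np.nanmedian(per_species))

# Toy data: 4 sites (rows), 2 species (columns).
Y = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])
S = np.array([[0.9, 0.2], [0.1, 0.8], [0.4, 0.7], [0.6, 0.3]])

base = median_auc(Y, S)
print(base)
```

A $\Delta$AUC for one species, as plotted in Figure 3, would then be the difference between `auc` values obtained from two models' scores for that species.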