MALPOLON: A Framework for Deep Species Distribution Modeling

Theo Larcher, Lukas Picek, Benjamin Deneu, Titouan Lorieul, Maximilien Servajean, Alexis Joly

TL;DR

MALPOLON offers straightforward installation, YAML-based configuration, parallel computing, multi-GPU utilization, baseline and foundational models for benchmarking, and extensive tutorials/documentation, aiming to enhance accessibility and performance scalability for ecologists and researchers.

Abstract

This paper describes MALPOLON, a framework for deep species distribution modeling (deep-SDM). Written in Python and built upon the PyTorch library, the framework aims to facilitate training, inference, and sharing of deep species distribution models for users with only general Python skills (e.g., modeling ecologists) who are interested in testing deep learning approaches to build new SDMs. More advanced users can also benefit from the framework's modularity to run more specific experiments by overriding existing classes, while taking advantage of press-button examples to train neural networks on multiple classification tasks using custom or provided raw and pre-processed datasets. The framework is open-sourced on GitHub and PyPI along with extensive documentation and examples of use in various scenarios. MALPOLON offers straightforward installation, YAML-based configuration, parallel computing, multi-GPU utilization, baseline and foundational models for benchmarking, and extensive tutorials/documentation, aiming to enhance accessibility and performance scalability for ecologists and researchers.

Paper Structure (10 sections, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Graphical abstract. MALPOLON provides straightforward (i) loading of various predictors, such as environmental rasters (e.g., land cover, human footprint), remote sensing data (e.g., Sentinel-2A and Landsat), and bioclimatic time-series, (ii) use of geospatial foundational models (e.g., SatCLIP, GeoCLIP), and (iii) model training at the press of a button.
  • Figure 2: Macro structure of MALPOLON. The Examples consist of pre-written, plug-and-play experiments for different use cases, covering training, inference, etc. The Engine contains everything needed to load and use datasets and models. The Toolbox provides a collection of useful scripts for data pre-processing.
  • Figure 3: Main components of MALPOLON. The framework contains custom datasets and models (in blue), whose data and weights are automatically retrieved from remote servers. The toolbox (in yellow) provides standalone data processing scripts. Examples (in orange) are provided when cloning the GitHub project and interact with the engine to run models for training or inference.
  • Figure 4: Spatial split of training and validation data points.
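
The caption of Figure 4 mentions a spatial split without further detail here. As a rough illustration of the general idea, and not MALPOLON's own implementation, the sketch below assigns occurrence points to grid blocks and holds out whole blocks for validation, so that nearby points never end up on both sides of the split. The function name `spatial_block_split`, the block size, and the validation fraction are all hypothetical choices made for this example.

```python
import math
import random
from collections import defaultdict


def spatial_block_split(points, block_size_deg=1.0, val_fraction=0.2, seed=0):
    """Split (lat, lon) points into train/val index lists by whole grid blocks.

    Hypothetical helper for illustration only; not part of the MALPOLON API.
    """
    # Group point indices by the grid cell (block) that contains them.
    blocks = defaultdict(list)
    for idx, (lat, lon) in enumerate(points):
        key = (math.floor(lat / block_size_deg), math.floor(lon / block_size_deg))
        blocks[key].append(idx)

    # Shuffle the block keys reproducibly, then hold out a fraction of blocks.
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)
    n_val = max(1, round(len(keys) * val_fraction)) if len(keys) > 1 else 0

    val_idx = sorted(i for k in keys[:n_val] for i in blocks[k])
    train_idx = sorted(i for k in keys[n_val:] for i in blocks[k])
    return train_idx, val_idx


# Example: six made-up occurrence points falling into four 1-degree blocks.
points = [(43.1, 3.2), (43.4, 3.7), (44.2, 3.1), (45.9, 5.5), (45.1, 5.2), (48.8, 2.3)]
train_idx, val_idx = spatial_block_split(points, block_size_deg=1.0, val_fraction=0.25)
```

Because whole blocks are held out, the two points near (43, 3) always land in the same subset, which is the property that distinguishes a spatial split from a purely random one.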