Applying the maximum entropy principle to neural networks enhances multi-species distribution models
Maxime Ryckewaert, Diego Marcos, Christophe Botella, Maximilien Servajean, Pierre Bonnet, Alexis Joly
TL;DR
DeepMaxent advances presence-only species distribution modelling by combining the maximum entropy principle with a shared neural feature extractor that learns multi-species presence intensities via a normalised Poisson loss. It generalises Maxent, remains connected to the Poisson likelihood, and implicitly implements Target-Group Background bias correction by normalising over the sites in each batch, which also enables scalable training. Empirical evaluation on the NCEAS and GeoPlant benchmarks shows that DeepMaxent outperforms traditional Maxent and other SDMs, with notable gains in regions with sampling bias and for rare species. The approach offers a flexible, data-driven framework that can integrate high-dimensional inputs (e.g., remote sensing) and scale to thousands of species, with strong potential for extensions to integrated SDMs and trait-based interspecific coupling.
Abstract
The rapid expansion of citizen science initiatives has led to significant growth in biodiversity databases, particularly in presence-only (PO) observations. PO data are invaluable for understanding species distributions and their dynamics, but their use in Species Distribution Models (SDMs) is limited by sampling biases and the lack of information on absences. Poisson point processes are widely used for SDMs, with Maxent being one of the most popular methods. Maxent maximises the entropy of a probability distribution across sites as a function of predefined transformations of variables, called features. In contrast, neural networks and deep learning have emerged as promising techniques for automatic feature extraction from complex input variables: arbitrarily complex transformations of input variables can be learned efficiently from the data through backpropagation and stochastic gradient descent (SGD). In this paper, we propose DeepMaxent, which harnesses neural networks to automatically learn features shared among species, using the maximum entropy principle. To do so, it employs a normalised Poisson loss in which, for each species, presence probabilities across sites are modelled by a neural network. We evaluate DeepMaxent on a benchmark dataset known for its spatial sampling biases, using PO data for calibration and presence-absence (PA) data for validation across six regions with different biological groups and covariates. Our results indicate that DeepMaxent performs better than Maxent and other leading SDMs across all regions and taxonomic groups. The method performs particularly well in regions of uneven sampling, demonstrating substantial potential to improve SDM performance. In particular, our approach yields more accurate predictions than traditional single-species models, which opens up new possibilities for methodological enhancement.
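
To make the idea of a normalised Poisson loss concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the network outputs one log-intensity per site and species, that these are turned into a probability distribution over the sites of a batch with a softmax, and that the loss is the cross-entropy against the normalised observed counts. The function name `normalised_poisson_loss`, the toy two-layer network, and all array shapes are hypothetical choices made for illustration.

```python
import numpy as np


def normalised_poisson_loss(log_intensity, counts):
    """Cross-entropy between predicted and observed site distributions.

    log_intensity: (n_sites, n_species) network outputs (assumed log-intensities).
    counts: (n_sites, n_species) presence-only counts per site and species.
    """
    # Numerically stable log-softmax over sites (axis 0), one column per species.
    shifted = log_intensity - log_intensity.max(axis=0, keepdims=True)
    log_p = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    # Normalise observed counts so that each species' column sums to 1.
    # Species with no observation in the batch keep an all-zero target.
    totals = counts.sum(axis=0, keepdims=True)
    target = counts / np.maximum(totals, 1.0)
    # Average the per-species cross-entropies (zero for unobserved species).
    return -(target * log_p).sum(axis=0).mean()


# Toy forward pass: one hidden layer shared by all species, then a linear head.
rng = np.random.default_rng(0)
n_sites, n_covariates, n_hidden, n_species = 8, 4, 16, 3
x = rng.normal(size=(n_sites, n_covariates))            # environmental covariates
w_shared = rng.normal(scale=0.1, size=(n_covariates, n_hidden))
w_head = rng.normal(scale=0.1, size=(n_hidden, n_species))
features = np.maximum(x @ w_shared, 0.0)                # shared learned features (ReLU)
log_intensity = features @ w_head                       # one column per species
counts = rng.poisson(1.0, size=(n_sites, n_species)).astype(float)
loss = normalised_poisson_loss(log_intensity, counts)
```

Because the softmax runs only over the sites present in the batch, and those are sites where at least one species was observed, the normalisation plays a role similar to a Target-Group Background sample under the assumptions of this sketch.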
