Generating Binary Species Range Maps
Filip Dorm, Christian Lange, Scott Loarie, Oisin Mac Aodha
TL;DR
The paper addresses how to binarize the outputs of deep, multi-species SDMs trained on presence-only data in order to produce binary range maps. It compares multiple thresholding strategies and introduces LPT-R, an absence-free approach that uses the 5th percentile of presences to set per-species thresholds, improving robustness to outliers. Evaluations on global expert-derived ranges (IUCN) and presence-absence benchmarks (S&T) show that LPT-R often yields the highest mean F1 and that thresholding can rival, or outperform, pseudo-absence-based methods. The resulting binary ranges can also serve as geo priors for large-scale image classification. The work demonstrates the practical benefit of species-specific thresholds that do not require generating pseudo-absences. The authors acknowledge data biases and the absence of environmental covariates in the main experiments, and they propose temporal extensions as future work. The method is computationally efficient and useful for conservation planning and for computer vision tasks that rely on geographic priors.
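The percentile-based thresholding summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "the 5th percentile of presences" means the 5th percentile of the model's predicted scores at a species' known presence locations. The function names `percentile_presence_threshold` and `binarize` and the toy scores are hypothetical.

```python
import numpy as np


def percentile_presence_threshold(presence_scores, percentile=5.0):
    """Per-species threshold from model scores at presence locations.

    Assumption: `presence_scores` holds the SDM's predicted probabilities
    at one species' observed presence locations. With percentile=0 this
    reduces to taking the minimum score, which a single outlying
    observation can drag down.
    """
    scores = np.asarray(presence_scores, dtype=float)
    if scores.size == 0:
        raise ValueError("need at least one presence score")
    return float(np.percentile(scores, percentile))


def binarize(score_map, threshold):
    """Mark a location as in-range when its score reaches the threshold."""
    return np.asarray(score_map, dtype=float) >= threshold


# Toy data: 99 plausible presence scores plus one outlying observation.
scores = np.concatenate([[0.01], np.linspace(0.5, 0.9, 99)])

min_threshold = percentile_presence_threshold(scores, percentile=0.0)
robust_threshold = percentile_presence_threshold(scores, percentile=5.0)

# Hypothetical continuous predictions on a small grid of locations.
score_map = np.array([[0.02, 0.30], [0.55, 0.80]])
range_min = binarize(score_map, min_threshold)
range_robust = binarize(score_map, robust_threshold)
```

In this toy case the minimum-based threshold is pulled down to the outlier, so every grid cell is marked as in-range, while the 5th-percentile threshold stays above 0.5 and keeps only the two high-scoring cells.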
Abstract
Accurately predicting the geographic ranges of species is crucial for assisting conservation efforts. Traditionally, range maps were manually created by experts. However, species distribution models (SDMs) and, more recently, deep learning-based variants offer a potential automated alternative. Deep learning-based SDMs generate a continuous probability representing the predicted presence of a species at a given location, which must be binarized by setting per-species thresholds to obtain binary range maps. Selecting appropriate thresholds is non-trivial, as different species can require distinct thresholds. In this work, we evaluate different approaches for automatically identifying the best thresholds for binarizing range maps using presence-only data. This includes approaches that require the generation of additional pseudo-absence data, along with ones that only require presence data. We also propose an extension of an existing presence-only technique that is more robust to outliers. We perform a detailed evaluation of different thresholding techniques on the tasks of binary range estimation and large-scale fine-grained visual classification, and we demonstrate improved performance over existing pseudo-absence-free approaches using our method.
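To make the evaluation on binary range estimation concrete, the following sketch scores a binarized range against a reference binary range with F1 and averages over species. It is an illustration under assumptions and not the paper's evaluation code. The function names `f1_binary` and `mean_f1`, the flat array layout, and the toy masks are hypothetical.

```python
import numpy as np


def f1_binary(predicted, reference):
    """F1 between two binary masks defined over the same set of locations."""
    pred = np.asarray(predicted, dtype=bool)
    ref = np.asarray(reference, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("masks must have the same shape")
    tp = int(np.sum(pred & ref))    # predicted present, truly present
    fp = int(np.sum(pred & ~ref))   # predicted present, truly absent
    fn = int(np.sum(~pred & ref))   # predicted absent, truly present
    denom = 2 * tp + fp + fn
    # Convention chosen here: F1 is 0 when both masks are empty.
    return 2 * tp / denom if denom > 0 else 0.0


def mean_f1(predicted_masks, reference_masks):
    """Average the per-species F1 scores."""
    scores = [f1_binary(p, r) for p, r in zip(predicted_masks, reference_masks)]
    return float(np.mean(scores))


# Toy example with two species and four locations each.
predicted = [np.array([1, 1, 0, 0]), np.array([1, 1, 1, 0])]
reference = [np.array([1, 0, 1, 0]), np.array([1, 1, 1, 0])]

per_species = [f1_binary(p, r) for p, r in zip(predicted, reference)]
overall = mean_f1(predicted, reference)
```

Here the first species has one true positive, one false positive, and one false negative, giving an F1 of 0.5, and the second matches its reference exactly, so the mean F1 is 0.75.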
