Leafy Spurge Dataset: Real-world Weed Classification Within Aerial Drone Imagery
Kyle Doherty, Max Gurinas, Erik Samsoe, Charles Casper, Beau Larkin, Philip Ramsey, Brandon Trabucco, Ruslan Salakhutdinov
TL;DR
The Leafy Spurge Dataset is a real-world drone-imagery benchmark for presence/absence classification of the invasive weed Euphorbia esula (leafy spurge) in western Montana. It pairs high-resolution aerial imagery (8241 images over 118 ha) with precise ground-truth labels and benchmarks two backbones, ResNet50 and DINOv2, at two image scales. Small crops yield the strongest results (0.84 accuracy), while vision transformers do best when given larger context (0.74 accuracy). These results indicate that weed classification in wildlands is tractable but not yet solved. The dataset is released to advance ecology, remote sensing, and foundation-model research, including few-shot and unsupervised learning with the unlabelled data. It supports the development of targeted weed management and invites exploration of multi-modal and foundation-model-driven approaches to real-world remote sensing tasks.
Abstract
Invasive plant species are detrimental to the ecology of both agricultural and wildland areas. Euphorbia esula, or leafy spurge, is one such plant that has spread through much of North America from Eastern Europe. When paired with contemporary computer vision systems, unmanned aerial vehicles, or drones, offer the means to track the expansion of problem plants, such as leafy spurge, and improve the chances of controlling these weeds. We gathered a dataset of leafy spurge presence and absence in grasslands of western Montana, USA, then surveyed these areas with a commercial drone. We trained image classifiers on these data, and our best performing model, a pre-trained DINOv2 vision transformer, identified leafy spurge with 0.84 accuracy (test set). This result indicates that classification of leafy spurge is tractable, but not solved. We release this unique dataset of labelled and unlabelled aerial drone imagery for the machine learning community to explore. Improving classification performance of leafy spurge would benefit the fields of ecology, conservation, and remote sensing alike. Code and data are available at our website: leafy-spurge-dataset.github.io.
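The accuracy figure quoted above is the fraction of test images whose predicted presence/absence label matches the ground truth. The sketch below shows that calculation in Python. The function name `accuracy` and the toy label lists are hypothetical and invented for illustration; they are not drawn from the dataset or from the authors' code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of images whose predicted label matches the ground truth."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction lists must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels, invented for illustration only (1 = spurge present, 0 = absent).
toy_truth = [1, 1, 0, 0, 1, 0, 1, 0]
toy_preds = [1, 0, 0, 0, 1, 1, 1, 0]

toy_accuracy = accuracy(toy_truth, toy_preds)
print(f"accuracy = {toy_accuracy:.2f}")  # 6 of 8 correct -> accuracy = 0.75
```

Accuracy is easiest to interpret when the two classes are roughly balanced. If they are not, per-class metrics such as precision and recall give a fuller picture.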
