EuroCropsML: A Time Series Benchmark Dataset For Few-Shot Crop Type Classification
Joana Reuss, Jan Macdonald, Simon Becker, Lorenz Richter, Marco Körner
TL;DR
EuroCropsML addresses the need for a transnational, few-shot-capable benchmark for crop type classification by introducing a time-series dataset built from 2021 Sentinel-2 L1C observations across three European regions of interest (ROIs) and a harmonized crop taxonomy. The authors present a two-stage data pipeline (data acquisition and pre-processing) that produces both raw and ready-to-use ML data, including cloud removal and per-parcel median band values at each time step. They define transfer-learning benchmark scenarios with eight few-shot settings and report baseline experiments with a Transformer encoder, highlighting the value of region-specific pre-training for cross-region generalization. The dataset is openly available on Zenodo, together with an accompanying eurocropsml Python package that supports acquisition, processing, and benchmark configuration, so that cross-region crop type classification experiments can be reproduced with ready-made splits and configurable settings.
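To make the few-shot settings concrete, the following is a minimal sketch of how a k-shot fine-tuning subset could be drawn from a labeled pool. It assumes that k counts examples per class, which the summary above does not state, and the function name `sample_k_shot` is hypothetical; it is not part of the eurocropsml package.

```python
import numpy as np


def sample_k_shot(labels, k, seed=0):
    """Return sorted indices holding at most k examples per class.

    Hypothetical helper for illustration; assumes k is a per-class budget.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    chosen = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        # Classes with fewer than k examples contribute everything they have.
        take = min(k, idx.size)
        chosen.append(rng.choice(idx, size=take, replace=False))
    return np.sort(np.concatenate(chosen))


# Toy label vector: three parcels of class 0, two of class 1, one of class 2.
labels = np.array([0, 0, 0, 1, 1, 2])
subset = sample_k_shot(labels, k=2)
```

Fixing the seed keeps the drawn subset reproducible across runs, which matters when different methods are compared on the same few-shot task.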
Abstract
We introduce EuroCropsML, an analysis-ready remote sensing machine learning dataset for time series crop type classification of agricultural parcels in Europe. It is the first dataset designed to benchmark transnational few-shot crop type classification algorithms, and it supports advances in algorithmic development and research comparability. It comprises 706 683 multi-class labeled data points across 176 classes, featuring annual time series of per-parcel median pixel values from Sentinel-2 L1C data for 2021, along with crop type labels and spatial coordinates. Based on the open-source EuroCrops collection, EuroCropsML is publicly available on Zenodo.
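As an illustration of what "per-parcel median pixel values" means, the sketch below reduces the pixels of a single parcel to one median value per band and time step, then drops the time steps flagged as cloudy. The array layout, the boolean cloud flag, and the name `parcel_median_series` are assumptions made for this example; the actual cloud-removal procedure and data format are not described in this abstract.

```python
import numpy as np


def parcel_median_series(pixels, cloudy):
    """Reduce one parcel to a per-band median time series.

    pixels: array of shape (T, P, B) = (time steps, pixels in parcel, bands).
    cloudy: boolean array of shape (T,), True where the observation is cloudy.
    Returns an array of shape (T_clear, B).
    Hypothetical helper; the layout and the mask are assumptions.
    """
    pixels = np.asarray(pixels, dtype=float)
    cloudy = np.asarray(cloudy, dtype=bool)
    # Median over the pixel axis gives one value per time step and band.
    medians = np.median(pixels, axis=1)
    # Keep only the time steps that are not flagged as cloudy.
    return medians[~cloudy]


# Toy parcel: 3 time steps, 4 pixels, 2 bands; the second time step is cloudy.
pixels = np.arange(24).reshape(3, 4, 2)
cloudy = np.array([False, True, False])
series = parcel_median_series(pixels, cloudy)
```

Because cloudy observations are dropped rather than interpolated in this sketch, parcels can end up with time series of different lengths, which a downstream model would need to handle, for example by padding.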
