WeedsGalore: A Multispectral and Multitemporal UAV-based Dataset for Crop and Weed Segmentation in Agricultural Maize Fields
Ekin Celikkan, Timo Kunzmann, Yertay Yeskaliyev, Sibylle Itzerott, Nadja Klein, Martin Herold
TL;DR
Weed management in maize is hindered by the scarcity of annotated, domain-specific data for UAV-based weed segmentation. We present WeedsGalore, a multispectral, multitemporal UAV dataset of maize fields with five bands and dense semantic and instance annotations for crops and four weed classes. Baseline experiments with DeepLabv3+ and MaskFormer show that multispectral information improves segmentation, especially for challenging weed classes, and that probabilistic inference with MC dropout provides reliable uncertainty estimates and better calibration. Generalization to unseen data is demonstrated on Maize2024, where models trained on WeedsGalore improve substantially over baselines, underscoring the dataset's practical value for UAV-based weeding and monitoring systems. The dataset and code are publicly available at https://github.com/GFZ/weedsgalore
Abstract
Weeds are one of the major causes of crop yield loss, but current weeding practices fail to manage them in an efficient and targeted manner. Effective weed management is especially important for crops with high worldwide production, such as maize, to maximize yield and meet increasing global demand. Advances in near-sensing and computer vision enable the development of new tools for weed management. Specifically, state-of-the-art segmentation models, coupled with novel sensing technologies, can facilitate timely and accurate weeding and monitoring systems. However, learning-based approaches require annotated data, and existing models generalize poorly to aerial imagery of different crops. We present a novel dataset for semantic and instance segmentation of crops and weeds in agricultural maize fields. The multispectral, multitemporal UAV-based dataset contains images with RGB, red-edge, and near-infrared bands, a large number of plant instances, and dense annotations for maize and four weed classes. We provide extensive baseline results for both tasks, including probabilistic methods that quantify prediction uncertainty, improve model calibration, and demonstrate the approach's applicability to out-of-distribution data. The results show the benefit of the two additional bands compared to RGB only, and better performance in our target domain than models trained on existing datasets. We hope our dataset advances research on methods and operational systems for fine-grained weed identification, enhancing the robustness and applicability of UAV-based weed management. The dataset and code are available at https://github.com/GFZ/weedsgalore
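To make the probabilistic methods mentioned above more concrete, the following is a minimal sketch of how MC dropout outputs are commonly aggregated into a per-pixel prediction and an uncertainty map. It is not taken from the released code: the function names `softmax` and `mc_dropout_summary`, the use of predictive entropy as the uncertainty measure, the six-class layout (background, maize, four weed classes), and the random logits standing in for stochastic forward passes are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def mc_dropout_summary(prob_samples):
    """Aggregate T stochastic softmax outputs of shape (T, C, H, W).

    Returns the mean class probabilities (C, H, W), the per-pixel
    predicted labels (H, W), and the predictive entropy (H, W).
    """
    probs = np.asarray(prob_samples, dtype=np.float64)
    mean_probs = probs.mean(axis=0)       # average over the T passes
    labels = mean_probs.argmax(axis=0)    # most likely class per pixel
    eps = 1e-12                           # avoids log(0)
    entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=0)
    return mean_probs, labels, entropy


# Toy stand-in for T = 8 forward passes with dropout kept active.
# Hypothetical layout: C = 6 classes (background, maize, four weeds)
# on a 4 x 4 pixel tile.
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 6, 4, 4))
samples = softmax(logits, axis=1)

mean_probs, labels, entropy = mc_dropout_summary(samples)
```

In a real pipeline, `samples` would come from repeated forward passes of a segmentation network with dropout enabled at inference time. High-entropy pixels could then be flagged for review or excluded from automated weeding decisions.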
