Vision Foundation Models in Agriculture: Toward Domain-Specific Adaptation for Weed Herbicide Trials Assessment
Leire Benito-Del-Valle, Artzai Picón, Daniel Mugica, Manuel Ramos, Eva Portillo, Javier Romero, Carlos Javier Jimenez, Ramón Navarra-Mestre
TL;DR
This work develops a domain-specific vision foundation model for herbicide trials by applying self-supervised pretraining (DINOv2) on a large, curated agricultural dataset and then fine-tuning with SegFormer decoders for vegetation, species, and damage segmentation. It demonstrates clear performance gains over general-purpose foundation models, especially under domain shifts and with limited labeled data, achieving higher F1 scores for both species identification and damage classification on BASE, REALITY, and DRONE datasets. The results highlight improved generalization to unseen environments and reduced annotation needs, suggesting practical scalability for automated herbicide-trial analysis. The study also discusses computational considerations and future directions (e.g., LoRA) to further enhance efficiency and applicability across broader agricultural tasks.
Abstract
Herbicide field trials require accurate identification of plant species and assessment of herbicide-induced damage across diverse environments. While general-purpose vision foundation models have shown promising results in complex visual domains, their performance can be limited in agriculture, where fine-grained distinctions between species and damage types are critical. In this work, we adapt a general-purpose vision foundation model to herbicide trial characterization. Trained using a self-supervised learning approach on a large, curated agricultural dataset, the model learns rich and transferable representations optimized for herbicide trials images. Our domain-specific model significantly outperforms the best general-purpose foundation model in both species identification (F1 score improvement from 0.91 to 0.94) and damage classification (from 0.26 to 0.33). Under unseen conditions (new locations and other time), it achieves even greater gains (species identification from 0.56 to 0.66; damage classification from 0.17 to 0.27). In domain-shift scenarios, such as drone imagery, it maintains strong performance (species classification from 0.49 to 0.60). Additionally, we show that domain-specific pretraining enhances segmentation accuracy, particularly in low-annotation regimes. An annotation-efficiency analysis reveals that, under unseen conditions, the domain-specific model achieves 5.4% higher F1 score than the general-purpose model, while using 80% fewer labeled samples. These results demonstrate the generalization capabilities of domain-specific foundation models and their potential to significantly reduce manual annotation efforts, offering a scalable and automated solution for herbicide trial analysis.
