From Web Data to Real Fields: Low-Cost Unsupervised Domain Adaptation for Agricultural Robots

Vasileios Tzouras, Lazaros Nalpantidis, Ronja Güldenring

TL;DR

This paper explores a novel domain shift: from a large, diverse pool of internet-sourced data to a small set of data collected by a robot at specific locations, which minimizes the need for extensive on-field data collection.

Abstract

In precision agriculture, vision models often struggle with new, unseen fields where crops and weeds have been influenced by external factors, resulting in compositions and appearances that differ from the learned distribution. This paper aims to adapt to specific fields at low cost using Unsupervised Domain Adaptation (UDA). We explore a novel domain shift from a diverse, large pool of internet-sourced data to a small set of data collected by a robot at specific locations, minimizing the need for extensive on-field data collection. Additionally, we introduce a novel module -- the Multi-level Attention-based Adversarial Discriminator (MAAD) -- which can be integrated at the feature extractor level of any detection model. In this study, we incorporate MAAD with CenterNet to simultaneously detect leaf, stem, and vein instances. Our results show significant performance improvements in the unlabeled target domain compared to baseline models, with a 7.5% increase in object detection accuracy and a 5.1% improvement in keypoint detection.

Paper Structure

This paper contains 13 sections, 5 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: We explore a novel domain shift: from internet-sourced data from all over the world (source domain) to images that have been collected with an agricultural robot platform at very specific locations (target domain). The source domain provides a large number of images with a wide variety of backgrounds and plant appearances. The data can be easily obtained without relying on agricultural vehicles and growing seasons. Finally, we introduce a Multi-level Attention-based Adversarial Discriminator (MAAD) to adapt the detection model to the target domain, which is represented by a small set of images collected at a specific location and with fixed camera settings.
  • Figure 2: Our proposed Multi-level Attention-based Adversarial Discriminator (MAAD), shown in pink, is integrated with the feature extractor of the detection model. While the detection model is trained on source data only, MAAD processes both low- and high-level features of the source (orange) and target (blue) domains through the low-level AAD (LAAD) and high-level AAD (HAAD). Each Attention-based Adversarial Discriminator (AAD) applies spatial attention to focus on informative regions, followed by a Gradient Reversal Layer (GRL) and a domain classifier that tries to tell the two domains apart. MAAD thereby forces the feature extractor to learn domain-invariant features, which are then processed to predict labels.
  • Figure 3: Qualitative differences between the source (iNaturalist, top) and target (RoboRumex, bottom) domains. The iNaturalist images cover a variety of backgrounds, such as mud, grass, dry leaves, and stones, and a wide range of plant sizes, from 2 to 25 leaves. In contrast, the RoboRumex data has a more uniform appearance, with plants of similar sizes and mainly grass as background.
  • Figure 4: Qualitative results on the RoboRumex test set: (a) input images, (b) ground truth, (c) predictions without UDA, and (d) predictions with UDA using our proposed MAAD. Blue and white arrows highlight improvements of MAAD over the baseline for OBB and keypoint detection, respectively. Orange arrows indicate false positives (FPs) and small leaves present in the ground truth that were not detected.
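
The Gradient Reversal Layer mentioned in the Figure 2 caption can be illustrated with a toy example. The sketch below is not the authors' MAAD implementation: it omits spatial attention and the multi-level design, and it replaces the network with a single scalar weight per component so that the gradients can be written out by hand. The function name `domain_gradients`, the weights `w` and `v`, and the scaling factor `lam` are all hypothetical names introduced for illustration. It shows only the general idea: the layer passes features through unchanged in the forward pass and flips the sign of the gradient in the backward pass, so the domain classifier learns to separate the domains while the feature extractor is pushed in the opposite direction.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_gradients(w, v, x, domain, lam=1.0):
    """Toy scalar model: feature extractor -> GRL -> domain classifier.

    w:      feature-extractor weight (hypothetical, scalar)
    v:      domain-classifier weight (hypothetical, scalar)
    x:      input value
    domain: 0 for a source sample, 1 for a target sample
    lam:    assumed scaling factor applied to the reversed gradient
    Returns (loss, grad_w, grad_v).
    """
    # Forward pass
    f = w * x              # feature extractor
    r = f                  # GRL forward: identity
    p = sigmoid(v * r)     # domain classifier: predicted P(target)
    loss = -(domain * math.log(p) + (1 - domain) * math.log(1 - p))

    # Backward pass, written out by hand
    dloss_dlogit = p - domain      # gradient of BCE-with-sigmoid w.r.t. the logit
    grad_v = dloss_dlogit * r      # classifier receives the ordinary gradient
    grad_r = dloss_dlogit * v      # gradient arriving at the GRL output
    grad_f = -lam * grad_r         # GRL backward: flip the sign, scale by lam
    grad_w = grad_f * x            # feature extractor receives the reversed gradient
    return loss, grad_w, grad_v


if __name__ == "__main__":
    loss, grad_w, grad_v = domain_gradients(w=0.5, v=1.0, x=2.0, domain=1)
    print(f"loss={loss:.4f} grad_w={grad_w:.4f} grad_v={grad_v:.4f}")
```

With gradient descent, the update on `v` lowers the domain classification loss, while the update on `w` raises it, which is the adversarial pressure toward domain-invariant features. Setting `lam=0` in this toy cuts the feature extractor off from the domain loss entirely.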