Generative AI-based Pipeline Architecture for Increasing Training Efficiency in Intelligent Weed Control Systems

Sourav Modak, Anthony Stein

TL;DR

This study presents a new approach for generating synthetic images to improve deep learning-based object detection models for intelligent weed control. The pipeline integrates the Segment Anything Model for zero-shot domain adaptation with a text-to-image Stable Diffusion Model, enabling the creation of synthetic images that capture diverse real-world conditions.

Abstract

In automated crop protection tasks such as weed control, disease diagnosis, and pest monitoring, deep learning has demonstrated significant potential. However, these advanced models rely heavily on high-quality, diverse datasets, which are often limited and costly in agricultural settings. Traditional data augmentation can increase dataset volume but usually lacks the real-world variability needed for robust training. This study presents a new approach for generating synthetic images to improve deep learning-based object detection models for intelligent weed control. Our GenAI-based image generation pipeline integrates the Segment Anything Model (SAM) for zero-shot domain adaptation with a text-to-image Stable Diffusion Model, enabling the creation of synthetic images that capture diverse real-world conditions. We evaluate these synthetic datasets using lightweight YOLO models, measuring data efficiency with mAP50 and mAP50-95 scores across varying proportions of real and synthetic data. Notably, YOLO models trained on datasets with 10% synthetic and 90% real images generally demonstrate superior mAP50 and mAP50-95 scores compared to those trained solely on real images. This approach not only reduces dependence on extensive real-world datasets but also enhances predictive performance. Integrating this approach opens opportunities for continual self-improvement of perception modules in intelligent technical systems.
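The evaluation across "varying proportions of real and synthetic data" can be illustrated with a minimal Python sketch of how one mixed training subset might be drawn. This is not the authors' implementation: the function name `mix_dataset`, the file names, the fixed subset size, and the seeding scheme are all hypothetical and chosen only for illustration.

```python
import random


def mix_dataset(real, synthetic, syn_fraction, total, seed=0):
    """Draw `total` items, of which a `syn_fraction` share is synthetic.

    Hypothetical helper: the paper does not specify how subsets are built.
    """
    if not 0.0 <= syn_fraction <= 1.0:
        raise ValueError("syn_fraction must lie in [0, 1]")
    n_syn = round(total * syn_fraction)
    n_real = total - n_syn
    if n_syn > len(synthetic) or n_real > len(real):
        raise ValueError("not enough images for the requested mix")
    rng = random.Random(seed)  # fixed seed makes the subset reproducible
    subset = rng.sample(real, n_real) + rng.sample(synthetic, n_syn)
    rng.shuffle(subset)
    return subset


# Placeholder file names, not the paper's dataset.
real_images = [f"real_{i:03d}.png" for i in range(100)]
synthetic_images = [f"syn_{i:03d}.png" for i in range(100)]

# 10% synthetic and 90% real, the mix highlighted in the abstract.
train = mix_dataset(real_images, synthetic_images, syn_fraction=0.1, total=50)
```

Calling the function with different `seed` values would yield independent subsets at the same ratio; validation and test data would stay real-only and fixed, so that scores remain comparable across ratios.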

Paper Structure

This paper contains 16 sections, 6 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: Synthetic object (here weed) generation pipeline. The upper half shows the Dataset Transformation phase: the foundation model SAM is used to convert object detection datasets into instance segmentation datasets (b). Weed classes are masked to eliminate complex backgrounds while preserving image integrity (c). The lower half shows the Image Generation phase: a Stable Diffusion Model is fine-tuned using weed masks and background images to facilitate text-guided image generation (d), followed by model-guided label generation (e). The last step (f) shows options for image quality assessment (IQA): image-specific methods, including quantitative and qualitative metrics, and task-specific approaches prescribed by the downstream task, i.e., the mAP score for object detection.
  • Figure 2: Sample pseudo-RGB images from the sugar beet dataset, captured from the described outdoor euro pallet setup. The dataset comprises sugar beet and weed classes (Cirsium, Convolvulus, Fallopia, and Echinochloa)
  • Figure 3: Synthetic images generated by fixed weed class mode, depicting different weed classes: (a) Cirsium, (b) Convolvulus, (c) Fallopia, (d) Echinochloa
  • Figure 4: Synthetic images created by random generation mode, depicting diversified plant and weed classes on the plots
  • Figure 5: Overview of the targeted random sub-sampling strategy used in this study. Ten independent dataset subsets were generated, each with defined proportions of synthetic (Syn) and real (Real) images. YOLO models were trained on the specific dataset combinations within each subset and evaluated against a fixed validation and test set comprising only real-world data, providing a robust and consistent framework for comparing model performance across varying synthetic-to-real training ratios
  • ...and 6 more figures
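
The masking step described in the Figure 1 caption (c), where weed instances are kept and the background is removed, can be sketched as follows. This is a minimal illustration assuming a binary instance mask is already available (for example, one produced by SAM); the function `mask_background` and the toy arrays are hypothetical, and the authors' actual masking procedure may differ.

```python
import numpy as np


def mask_background(image, mask, fill=0):
    """Keep pixels where `mask` is True and set all others to `fill`.

    image: (H, W, C) array; mask: (H, W) boolean array.
    Hypothetical helper, not taken from the paper.
    """
    if image.shape[:2] != mask.shape:
        raise ValueError("mask must match the image height and width")
    out = np.full_like(image, fill)
    out[mask] = image[mask]  # copy only the foreground (weed) pixels
    return out


# Toy 4x4 RGB image with a 2x2 foreground region in the centre.
image = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

masked = mask_background(image, mask)
```

The result has the same shape as the input, so it could be paired with a separate background image when preparing fine-tuning data.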