Domain Generalization for Crop Segmentation with Standardized Ensemble Knowledge Distillation

Simone Angarano, Mauro Martini, Alessandro Navone, Marcello Chiaberge

TL;DR

The paper addresses domain generalization (DG) in crop segmentation by introducing a standardized ensemble knowledge distillation framework that transfers domain-specific expertise from multiple teachers to a single student. It formalizes DG for semantic segmentation, trains one teacher per source domain, and standardizes the teachers' logits before ensembling so that no single teacher dominates, which improves generalization to unseen crops and conditions. A new synthetic multi-domain dataset, AgriSeg, with over 70,000 samples across 11 crops, underpins extensive evaluations that show consistent improvements over state-of-the-art DG methods and strong sim-to-real transfer. The approach uses a lightweight architecture (MobileNetV3 + LR-ASPP) and adds no test-time overhead, making it a practical solution for real-time perception in precision agriculture and a benchmark for DG in crop segmentation.
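The standardized ensembling step can be illustrated with a minimal sketch. This summary does not give the paper's exact formulation, so the following are assumptions: each teacher's logit map is z-scored over its own pixels, the standardized maps are averaged, and a sigmoid with temperature `tau` turns the result into a soft crop/background mask. The function names and the toy logits are hypothetical.

```python
import math
from statistics import fmean, pstdev


def standardize(logits, eps=1e-6):
    """Z-score one teacher's logit map (flattened to a list of floats)."""
    mu = fmean(logits)
    sigma = pstdev(logits)
    return [(z - mu) / (sigma + eps) for z in logits]


def standardized_ensemble(teacher_logits, tau=1.0):
    """Average standardized logits across teachers, then squash to (0, 1).

    Assumption: binary segmentation, so a sigmoid is used instead of a softmax.
    """
    standardized = [standardize(t) for t in teacher_logits]
    n_pixels = len(standardized[0])
    mean_logits = [
        fmean([t[i] for t in standardized]) for i in range(n_pixels)
    ]
    return [1.0 / (1.0 + math.exp(-z / tau)) for z in mean_logits]


# Two hypothetical teachers that agree on the ranking of pixels
# but output logits on very different scales.
teacher_a = [-20.0, -10.0, 10.0, 20.0]
teacher_b = [-2.0, -1.0, 1.0, 2.0]

soft_mask = standardized_ensemble([teacher_a, teacher_b], tau=1.0)
print([round(p, 3) for p in soft_mask])
```

Without standardization, a plain average would be dominated by `teacher_a`, whose logits are ten times larger. After z-scoring, both teachers contribute nearly identical maps, which is the balancing effect the TL;DR attributes to logit standardization.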

Abstract

In recent years, precision agriculture has gradually moved farming toward automated processes that support all activities related to field management. Service robotics plays a predominant role in this evolution by deploying autonomous agents that can navigate fields while performing tasks such as monitoring, spraying, and harvesting without human intervention. To execute these precise actions, mobile robots need a real-time perception system that understands their surroundings and identifies their targets in the wild. Existing methods, however, often fall short in generalizing to new crops and environmental conditions. This limitation is critical for practical applications, where labeled samples are rarely available. In this paper, we investigate the problem of crop segmentation and propose a novel approach to enhance domain generalization using knowledge distillation. In the proposed framework, we transfer knowledge from a standardized ensemble of models individually trained on source domains to a student model that can adapt to unseen realistic scenarios. To support the proposed method, we present a synthetic multi-domain dataset for crop segmentation containing more than 70,000 samples of plants of varied species, covering different terrain styles, weather conditions, and light scenarios. We demonstrate significant improvements in performance over state-of-the-art methods and superior sim-to-real generalization. Our approach provides a promising solution for domain generalization in crop segmentation and has the potential to enhance a wide variety of agricultural applications.

Paper Structure (16 sections, 8 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Schematic representation of the proposed distillation methodology for crop segmentation. Ensembled specialized teachers allow the student to obtain a standardized distillation mask ($\tilde{y}^T$) that is much more informative than the hard label ($y$) for robust student training. $\mu_s$ represents standardized ensembling.
  • Figure 2: From left to right: examples of synthetic 3D crop models used to build the AgriSeg Dataset (generic tree, zucchini, lettuce, vineyard); examples of resulting dataset images (vineyard, chard); examples of real-world test images (vineyard, miscellaneous).
  • Figure 3: Comparison of ERM predictions with our ensemble of specialized teachers. While for simpler domains, the predictions of the specialized teachers agree and return a high-confidence mask, for challenging ones, the teachers give an uncertain but more informative mask that can be distilled into the student.
  • Figure 4: Ablation study on the hyper-parameters $\lambda$ and $\tau$. The reported IoU value refers to the Real Miscellaneous domain and is averaged over three runs. Two views of the results are shown for better readability.
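
The captions above mention a distillation mask $\tilde{y}^T$, the hard label $y$, and the hyper-parameters $\lambda$ and $\tau$, but this summary does not define how they are combined. The sketch below shows one common way to build such a student objective. It assumes that $\lambda$ weights the distillation term against the supervised term and that $\tau$ is a temperature applied to the student logits. The names `student_loss` and `bce` and all toy values are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z, tau=1.0):
    """Sigmoid with temperature; tau > 1 softens the output."""
    return 1.0 / (1.0 + math.exp(-z / tau))


def bce(targets, probs, eps=1e-7):
    """Mean binary cross-entropy; targets may be hard (0/1) or soft."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)


def student_loss(student_logits, hard_labels, teacher_mask, lam=0.5, tau=2.0):
    """Assumed form: (1 - lam) * BCE(y, p) + lam * BCE(y_tilde_T, p_tau)."""
    probs = [sigmoid(z) for z in student_logits]
    soft_probs = [sigmoid(z, tau) for z in student_logits]
    supervised = bce(hard_labels, probs)        # term using the hard label y
    distill = bce(teacher_mask, soft_probs)     # term using the soft teacher mask
    return (1.0 - lam) * supervised + lam * distill


# Toy example with three pixels.
hard_labels = [0.0, 1.0, 1.0]
teacher_mask = [0.1, 0.6, 0.9]      # hypothetical soft mask from the teachers
good_logits = [-3.0, 2.0, 4.0]      # student agrees with the labels
bad_logits = [3.0, -2.0, -4.0]      # student disagrees with the labels

good = student_loss(good_logits, hard_labels, teacher_mask)
bad = student_loss(bad_logits, hard_labels, teacher_mask)
print(good < bad)
```

With `lam=0.0` the objective reduces to ordinary supervised training on the hard labels, so sweeping `lam` and `tau` as in the ablation amounts to varying how strongly, and how softly, the teacher mask shapes the student.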