AdaCropFollow: Self-Supervised Online Adaptation for Visual Under-Canopy Navigation

Arun N. Sivakumar, Federico Magistri, Mateus V. Gasparino, Jens Behley, Cyrill Stachniss, Girish Chowdhary

TL;DR

Preliminary experiments show that, with minimal data and parameter fine-tuning, a keypoint prediction model trained with labels on the source domain can be adapted in a self-supervised manner to various challenging target domains onboard the robot computer using the proposed method.

Abstract

Under-canopy agricultural robots can enable applications such as precise monitoring, spraying, weeding, and plant manipulation throughout the growing season. Autonomous navigation under the canopy is challenging because RTK-GPS accuracy degrades there and the visual appearance of the scene varies widely over time. In prior work, we developed a supervised learning-based perception system with a semantic keypoint representation and deployed it in various field conditions. A large number of this system's failures can be attributed to the perception model's inability to adapt to the domain shift encountered during deployment. In this paper, we propose a self-supervised online adaptation method that adapts the semantic keypoint representation using a visual foundation model, a geometric prior, and pseudo-labeling. Our preliminary experiments show that, with minimal data and parameter fine-tuning, the keypoint prediction model trained with labels on the source domain can be adapted in a self-supervised manner to various challenging target domains onboard the robot computer using our method. This can enable fully autonomous row following in under-canopy robots across fields and crops without human intervention.

Paper Structure

This paper contains 11 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: Under-canopy robots navigating with machine learning-based systems encounter domain shifts due to variations in the environment. We propose a self-supervised online learning method to adapt to these shifts.
  • Figure 2: AdaCropFollow overview: We train the keypoint prediction model in the source domain (early-season corn) with labeled data. During online adaptation, we first adapt the vanishing point using a stereo disparity loss and then adapt the remaining two intercept keypoints using pseudo-labeling guided by a geometric prior. Note that we use the frozen feature encoder from DINOv2 during both source-domain training and online adaptation.
  • Figure 3: We visualize the pixel locations predicted by the model before and after adaptation in three different target environments: green late-season corn, brown late-season corn, and an orchard. The predicted keypoint pixel locations are considerably more accurate after adaptation.
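
The pseudo-labeling step named in the Figure 2 caption could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `KeypointPrediction` type, the per-frame `confidence` score, the `min_confidence` threshold, and the specific geometric check (the vanishing point lies above and horizontally between the two intercept keypoints) are all assumptions made for this example. The stereo disparity loss used for the vanishing point is not covered here.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixels; y grows downward in the image


@dataclass
class KeypointPrediction:
    """Hypothetical container for one frame's three predicted keypoints."""
    vanishing_point: Point
    left_intercept: Point
    right_intercept: Point
    confidence: float  # assumed per-frame score in [0, 1]


def passes_geometric_prior(pred: KeypointPrediction) -> bool:
    """Assumed prior: the vanishing point sits above and between the intercepts."""
    vx, vy = pred.vanishing_point
    lx, ly = pred.left_intercept
    rx, ry = pred.right_intercept
    between = lx < vx < rx          # left intercept left of VP, right one right of it
    below = ly > vy and ry > vy     # both intercepts lower in the image than the VP
    return between and below


def select_pseudo_labels(
    preds: List[KeypointPrediction], min_confidence: float = 0.8
) -> List[KeypointPrediction]:
    """Keep only confident predictions that are consistent with the assumed prior."""
    return [
        p for p in preds
        if p.confidence >= min_confidence and passes_geometric_prior(p)
    ]


if __name__ == "__main__":
    frames = [
        KeypointPrediction((320.0, 100.0), (50.0, 400.0), (600.0, 400.0), 0.9),
        KeypointPrediction((320.0, 100.0), (400.0, 400.0), (600.0, 400.0), 0.95),
    ]
    kept = select_pseudo_labels(frames)
    print(f"kept {len(kept)} of {len(frames)} frames as pseudo-labels")
```

In a setup like this, the retained predictions would serve as training targets for the intercept keypoints on target-domain frames, while rejected frames are simply skipped. The actual prior and filtering rule used in the paper may differ.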