Learning to Turn: Diffusion Imitation for Robust Row Turning in Under-Canopy Robots

Arun N. Sivakumar, Pranay Thangeda, Yixiao Fang, Mateus V. Gasparino, Jose Cuaran, Melkior Ornik, Girish Chowdhary

TL;DR

This work tackles robust row turning for under-canopy agricultural robots, where GPS reception is degraded and vision suffers from aliasing and occlusion. It introduces a diffusion-based imitation learning approach that learns row-turn policies, including recovery behaviors, from demonstrations, using RGB observations and velocity states as input. A conditional diffusion policy (DDPM) is trained on 350 demonstrations from human teleoperators and privileged MPC in a high-fidelity simulator, and is evaluated on left-turn and one-row-skipping tasks. The results demonstrate the feasibility of diffusion policies for this control problem but reveal brittleness when the robot is inside rows, pointing to future work on goal conditioning and real-world deployment for end-to-end row navigation.

Abstract

Under-canopy agricultural robots require robust navigation capabilities to enable full autonomy but struggle with tight row turning between crop rows due to degraded GPS reception, visual aliasing, occlusion, and complex vehicle dynamics. We propose an imitation learning approach using diffusion policies to learn row turning behaviors from demonstrations provided by human operators or privileged controllers. Simulation experiments in a corn field environment show potential in learning this task with only visual observations and velocity states. However, challenges remain in maintaining control within rows and handling varied initial conditions, highlighting areas for future improvement.

Paper Structure

This paper contains 4 sections, 3 equations, and 2 figures.

Figures (2)

  • Figure 1: Overview of the proposed method for learning row turning behaviors using diffusion policies. (Left) Demonstrations collected in the simulation environment, either through human teleoperation or procedurally generated using privileged information. (Right) Architecture that takes in a history of RGB and robot-state observations and generates a sequence of actions for execution.
  • Figure 2: Bird's-eye view of trajectories in a corn field. Rows of corn are shown as solid green lines. Each trajectory has a unique color, with • indicating the starting point and $\times$ marking the end point. (Top) Sample expert demonstration trajectories collected for training the policy, showing the diversity of paths in the training data. (Bottom) Sample rollouts generated by the trained policy from various initial conditions, illustrating the policy's ability to produce trajectories similar to the demonstration data.
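
The right-hand side of Figure 1 describes a policy that turns an observation history into a sequence of actions. As a rough illustration of how a conditional DDPM could do this at inference time, the sketch below runs the standard DDPM reverse (denoising) loop over an action sequence. It is not the authors' implementation: the linear noise schedule, the number of steps, the horizon, the two-dimensional (linear, angular) velocity action, and the function names `make_linear_schedule`, `toy_eps_model`, and `sample_actions` are all assumptions made for this example. In particular, `toy_eps_model` is a hypothetical stand-in for the trained noise-prediction network; it simply pretends that every demonstration for a given observation repeats one fixed action.

```python
import numpy as np


def make_linear_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_eps_model(noisy_actions, k, obs, alpha_bars):
    """Hypothetical stand-in for a trained network eps_theta(a_k, k, obs).

    It assumes the expert always repeats the action stored in `obs`, so the
    noise that produced `noisy_actions` at step k can be computed exactly.
    """
    target = np.tile(obs, (noisy_actions.shape[0], 1))
    return (noisy_actions - np.sqrt(alpha_bars[k]) * target) / np.sqrt(
        1.0 - alpha_bars[k]
    )


def sample_actions(eps_model, obs, horizon, action_dim, num_steps=50, seed=0):
    """DDPM reverse process: start from Gaussian noise and denoise step by step."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_linear_schedule(num_steps)
    actions = rng.standard_normal((horizon, action_dim))
    for k in reversed(range(num_steps)):
        eps_hat = eps_model(actions, k, obs, alpha_bars)
        # Posterior mean of the previous, less noisy action sequence.
        mean = (
            actions - betas[k] / np.sqrt(1.0 - alpha_bars[k]) * eps_hat
        ) / np.sqrt(alphas[k])
        if k > 0:
            # Inject fresh noise on every step except the last one.
            actions = mean + np.sqrt(betas[k]) * rng.standard_normal(actions.shape)
        else:
            actions = mean
    return actions


if __name__ == "__main__":
    # Hypothetical observation: the (linear, angular) velocity the toy expert repeats.
    obs = np.array([0.5, 0.3])
    plan = sample_actions(toy_eps_model, obs, horizon=8, action_dim=2)
    print(plan.round(3))
```

Because the toy predictor is exact for its single-action "dataset", the final denoising step returns that action for every timestep in the horizon. A learned network would instead be conditioned on encoded RGB frames and velocity states, and would produce samples from the distribution of demonstrated turning trajectories.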