End-to-End Crop Row Navigation via LiDAR-Based Deep Reinforcement Learning

Ana Luiza Mineiro, Francisco Affonso, Marcelo Becker

TL;DR

This work tackles autonomous crop-row navigation under the canopy, where GNSS is unreliable. It presents an end-to-end reinforcement learning policy, trained entirely in simulation, that maps raw 3D LiDAR data to steering commands. A voxel-based LiDAR downsampling pipeline converts 3D point clouds into compact 2D row maps, enabling learning from limited observations without labeled data or modular interfaces. The policy is trained with PPO in a POMDP-inspired framework that uses a history of LiDAR-derived row maps and a reward designed to balance progress and safety. In simulation, the approach achieves a 100% success rate on straight rows and generalizes to curved rows within the training distribution, with performance degrading gradually as curvature increases. These results indicate practical potential and the need for real-world validation.
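The history of row maps mentioned above can be pictured as a fixed-length buffer of the most recent maps. The following is a minimal sketch, not the authors' implementation: the class name `RowMapHistory`, the parameter `history_len`, and the zero-filled padding at episode start are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List

RowMap = List[List[int]]  # a 2D grid; the cell encoding is an assumption


class RowMapHistory:
    """Hypothetical fixed-length buffer of the most recent row maps."""

    def __init__(self, history_len: int, blank: RowMap) -> None:
        # Assumption: the buffer starts filled with blank maps so the
        # stacked observation has a constant shape from the first step.
        self.frames: Deque[RowMap] = deque(
            [blank for _ in range(history_len)], maxlen=history_len
        )

    def push(self, row_map: RowMap) -> List[RowMap]:
        """Add the newest map, drop the oldest, return the stack (oldest first)."""
        self.frames.append(row_map)
        return list(self.frames)


history = RowMapHistory(history_len=3, blank=[[0]])
obs = history.push([[1]])  # two blank maps followed by the newest map
```

Stacking several consecutive maps is a common way to give a feed-forward policy some temporal context under partial observability; whether the paper uses this exact layout is not stated in this summary.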

Abstract

Reliable navigation in under-canopy agricultural environments remains a challenge due to GNSS unreliability, cluttered rows, and variable lighting. To address these limitations, we present an end-to-end learning-based navigation system that maps raw 3D LiDAR data directly to control commands using a deep reinforcement learning policy trained entirely in simulation. Our method includes a voxel-based downsampling strategy that reduces LiDAR input size by 95.83%, enabling efficient policy learning without relying on labeled datasets or manually designed control interfaces. The policy was validated in simulation, achieving a 100% success rate in straight-row plantations and showing a gradual decline in performance as row curvature increased, tested across varying sinusoidal frequencies and amplitudes.
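To make the downsampling idea concrete, the sketch below bins 3D points into a coarse 2D occupancy grid and computes a size reduction percentage. It is an illustration under stated assumptions, not the paper's transformation: the function names, the region bounds, the cell size, and the binary occupancy encoding are all hypothetical.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]


def row_map_from_points(
    points: Iterable[Point],
    x_range: Tuple[float, float] = (0.0, 4.0),   # assumed forward extent (m)
    y_range: Tuple[float, float] = (-1.0, 1.0),  # assumed lateral extent (m)
    z_range: Tuple[float, float] = (0.0, 1.0),   # assumed height band (m)
    cell: float = 0.25,                          # assumed cell size (m)
) -> List[List[int]]:
    """Project 3D points into a binary 2D grid (rows along x, columns along y)."""
    n_rows = int(round((x_range[1] - x_range[0]) / cell))
    n_cols = int(round((y_range[1] - y_range[0]) / cell))
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        inside = (
            x_range[0] <= x < x_range[1]
            and y_range[0] <= y < y_range[1]
            and z_range[0] <= z < z_range[1]
        )
        if not inside:
            continue  # discard points outside the region of interest
        # Clamp indices to guard against floating-point edge cases.
        i = min(int((x - x_range[0]) / cell), n_rows - 1)
        j = min(int((y - y_range[0]) / cell), n_cols - 1)
        grid[i][j] = 1  # mark the cell as occupied
    return grid


def reduction_percent(n_in: int, n_out: int) -> float:
    """Percentage by which an input of n_in values shrinks to n_out values."""
    return 100.0 * (1.0 - n_out / n_in)


cloud = [(0.1, -0.9, 0.5), (0.1, -0.9, 0.6), (3.9, 0.9, 0.2), (5.0, 0.0, 0.5)]
grid = row_map_from_points(cloud)
```

As a purely arithmetic aside, `reduction_percent(24, 1)` rounds to 95.83, so the reported figure is consistent with keeping one value in every 24. The actual input and output sizes are not given in this summary.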

Paper Structure

This paper contains 15 sections, 5 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: TerraSentia robot navigating in agricultural environments [terrasentia2020].
  • Figure 2: Workflow of the end-to-end row-following navigation system. First, raw LiDAR point clouds captured from the environment are downsampled and transformed into 2D row maps using a transformation function $\phi$, reducing their dimensionality. A finite sequence of row maps is then stacked to construct an observation history, which serves as input to the policy network. Trained using a reinforcement learning algorithm, the policy generates actions that guide the robot to accurately follow crop rows while avoiding collisions.
  • Figure 3: Illustration of the proposed data downsampling method.
  • Figure 4: Representation of the downsampling method. Top: LiDAR point clouds captured over time. Bottom: corresponding row maps used as policy input.
  • Figure 5: Navigation trajectories of the trained policies bounded by plantation geometry. Top: Straight-row scenarios. Bottom: Curved-row scenarios.
  • ...and 3 more figures