Enhanced View Planning for Robotic Harvesting: Tackling Occlusions with Imitation Learning
Lun Li, Hamidreza Kasaei
TL;DR
This work tackles occlusion in robotic harvesting with an end-to-end imitation-learning viewpoint planner that continuously adjusts a camera in 6-DoF to reveal occluded crops. The approach uses Action Chunking with Transformer (ACT) to predict action chunks from RGB-D and pose-change observations, and is trained via behavior cloning on expert demonstrations collected in Gazebo. In simulation, the planner achieves an 86.7% success rate with a planning time of 3.1 s and generalizes across eight fruit types. Real-world tests yield a lower success rate of 66.7%, with the drop attributed to clutter and lighting variations, while still demonstrating practical potential. Overall, the study provides a data-efficient, generalizable learning-from-demonstration (LfD) solution for occlusion-aware view planning that runs as real-time closed-loop control at 10 Hz and improves autonomous harvesting performance and productivity.
Abstract
In agricultural automation, inherent occlusion presents a major challenge for robotic harvesting. We propose a novel imitation learning-based viewpoint planning approach that actively adjusts the camera viewpoint to capture unobstructed images of the target crop. Traditional viewpoint planners and existing learning-based methods, which depend on manually designed evaluation metrics or reward functions, often struggle to generalize to complex, unseen scenarios. Our method employs the Action Chunking with Transformer (ACT) algorithm to learn effective camera motion policies from expert demonstrations. This enables continuous six-degree-of-freedom (6-DoF) viewpoint adjustments that are smoother and more precise, and that reveal occluded targets. Extensive experiments in both simulated and real-world environments, featuring agricultural scenarios and a 6-DoF robot arm equipped with an RGB-D camera, demonstrate our method's superior success rate and efficiency, especially under complex occlusion conditions, as well as its ability to generalize across different crops without reprogramming. This study advances robotic harvesting by providing a practical "learn from demonstration" (LfD) solution to occlusion challenges, ultimately enhancing autonomous harvesting performance and productivity.
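
To illustrate what predicting and executing action chunks can look like in a closed loop, the following is a minimal sketch and not the authors' implementation. It assumes a policy that returns a short sequence of 6-DoF pose deltas per query; the names placeholder_policy and run_chunked_control, the chunk size, the tolerance, and the additive pose update (a small-angle simplification of true 6-DoF composition) are all hypothetical choices made for illustration. The placeholder policy reads the target pose directly, which a learned policy working from RGB-D observations would not do.

```python
from typing import Callable, List, Sequence, Tuple

Pose = List[float]  # [x, y, z, roll, pitch, yaw]; hypothetical representation


def placeholder_policy(pose: Sequence[float], target: Sequence[float],
                       chunk_size: int) -> List[Pose]:
    """Hypothetical stand-in for a learned chunked policy.

    Returns `chunk_size` identical pose deltas that together cover half of
    the remaining pose error. A trained model would instead map RGB-D and
    pose-change observations to the chunk.
    """
    step = [(t - p) * 0.5 / chunk_size for p, t in zip(pose, target)]
    return [list(step) for _ in range(chunk_size)]


def run_chunked_control(policy: Callable[[Sequence[float], Sequence[float], int], List[Pose]],
                        start: Sequence[float], target: Sequence[float],
                        chunk_size: int = 4, max_steps: int = 40,
                        tol: float = 1e-3) -> Tuple[Pose, int, int]:
    """Query the policy once per chunk, then execute the chunk step by step."""
    pose = list(start)
    steps = 0
    queries = 0
    while steps < max_steps:
        # Stop once every pose component is within tolerance of the target.
        if max(abs(t - p) for p, t in zip(pose, target)) < tol:
            break
        chunk = policy(pose, target, chunk_size)
        queries += 1
        for delta in chunk:
            if steps >= max_steps:
                break
            # Additive update: an assumed small-angle simplification.
            pose = [p + d for p, d in zip(pose, delta)]
            steps += 1
    return pose, steps, queries


if __name__ == "__main__":
    goal = [0.1, 0.0, 0.2, 0.0, 0.0, 0.3]
    final_pose, n_steps, n_queries = run_chunked_control(
        placeholder_policy, [0.0] * 6, goal)
    print(n_steps, n_queries, final_pose)
```

In this sketch the policy is queried once per chunk rather than once per control step, which is the basic idea behind action chunking. At the 10 Hz control rate reported above, each executed step would correspond to 0.1 s.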
