Gradient-based Local Next-best-view Planning for Improved Perception of Targeted Plant Nodes
Akshay K. Burusa, Eldert J. van Henten, Gert Kootstra
TL;DR
The paper tackles the challenge of accurately perceiving occluded plant nodes in greenhouses to enable targeted cutting and handling. It proposes a local NBV framework that uses gradient-based optimisation with differentiable ray sampling to maximise semantic information about a single target node, integrated with a 4D voxel grid and Kalman-tracked node states. The approach achieves reconstruction and position-estimation accuracy comparable to a sampling-based NBV planner, but with an order-of-magnitude reduction in computation and substantially smoother, more efficient view trajectories. Validation in simulation with tomato plant models demonstrates the practical benefits for real-time perception and planning in crop robotics, with potential for extension to global planning in larger environments.
Abstract
Robots are increasingly used in tomato greenhouses to automate labour-intensive tasks such as selective harvesting and de-leafing. To perform these tasks, robots must be able to accurately and efficiently perceive the plant nodes that need to be cut, despite the high levels of occlusion from other plant parts. We formulate this problem as a local next-best-view (NBV) planning task where the robot has to plan an efficient set of camera viewpoints to overcome occlusion and improve the quality of perception. Our formulation focuses on quickly improving the perception accuracy of a single target node to maximise its chances of being cut. Previous methods of NBV planning mostly focused on global view planning and used random sampling of candidate viewpoints for exploration, which can suffer from high computational costs, ineffective view selection due to poor candidates, or non-smooth trajectories due to inefficient sampling. We propose a gradient-based NBV planner using differentiable ray sampling, which directly estimates the local gradient direction for viewpoint planning to overcome occlusion and improve perception. Through simulation experiments, we show that our planner can handle occlusions and improve the 3D reconstruction and position estimation of nodes as well as a sampling-based NBV planner, while requiring ten times less computation and generating 28% more efficient trajectories.
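
To make the idea of following a local gradient concrete, the sketch below moves a camera around a single occluding leaf in a toy 2D scene. It is an illustration under stated assumptions, not the planner described in the paper. The scene layout, the Gaussian leaf model, the transmittance-style visibility score, and all names (`density`, `visibility`, `gradient`, `plan_views`) are hypothetical. The gradient is approximated with central finite differences, standing in for the automatic differentiation that a differentiable ray sampler would normally provide.

```python
import math

# Hypothetical 2D scene (not from the paper): the target node sits at the
# origin and one leaf is modelled as a smooth Gaussian density blob.
TARGET = (0.0, 0.0)
LEAF_CENTRE = (0.3, 0.0)
LEAF_SIGMA = 0.08
LEAF_DENSITY = 40.0


def density(x, y):
    """Smooth occupancy density of the leaf at point (x, y)."""
    dx, dy = x - LEAF_CENTRE[0], y - LEAF_CENTRE[1]
    return LEAF_DENSITY * math.exp(-(dx * dx + dy * dy) / (2 * LEAF_SIGMA ** 2))


def visibility(cam, n_samples=64):
    """Transmittance along the ray from cam to TARGET, using midpoint samples."""
    cx, cy = cam
    length = math.hypot(TARGET[0] - cx, TARGET[1] - cy)
    step = length / n_samples
    optical_depth = 0.0
    for i in range(n_samples):
        t = (i + 0.5) / n_samples
        x = cx + t * (TARGET[0] - cx)
        y = cy + t * (TARGET[1] - cy)
        optical_depth += density(x, y) * step
    return math.exp(-optical_depth)


def gradient(cam, eps=1e-4):
    """Central finite-difference gradient of visibility w.r.t. camera position."""
    cx, cy = cam
    gx = (visibility((cx + eps, cy)) - visibility((cx - eps, cy))) / (2 * eps)
    gy = (visibility((cx, cy + eps)) - visibility((cx, cy - eps))) / (2 * eps)
    return gx, gy


def plan_views(cam, n_steps=20, step_size=0.05):
    """Take fixed-length steps along the normalised gradient direction."""
    path = [cam]
    for _ in range(n_steps):
        gx, gy = gradient(cam)
        norm = math.hypot(gx, gy)
        if norm < 1e-9:  # flat score: no informative direction left
            break
        cam = (cam[0] + step_size * gx / norm, cam[1] + step_size * gy / norm)
        path.append(cam)
    return path


if __name__ == "__main__":
    views = plan_views((0.6, 0.02))  # start almost directly behind the leaf
    for view in views:
        print(f"camera=({view[0]:.3f}, {view[1]:.3f}) "
              f"visibility={visibility(view):.3f}")
```

Because every step is a short move along the local gradient, consecutive viewpoints stay close together. This is the intuition behind the smoother trajectories mentioned above, in contrast to jumping between randomly sampled candidate views.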
