Gradient-based Local Next-best-view Planning for Improved Perception of Targeted Plant Nodes

Akshay K. Burusa, Eldert J. van Henten, Gert Kootstra

TL;DR

The paper tackles the challenge of accurately perceiving occluded plant nodes in greenhouses to enable targeted cutting and handling. It proposes a local next-best-view (NBV) framework that uses gradient-based optimisation with differentiable ray sampling to maximise semantic information about a single target node, integrated with a 4D voxel grid and Kalman-tracked node states. The approach matches the reconstruction and position-estimation accuracy of a sampling-based NBV planner while requiring about ten times less computation and producing smoother view trajectories that are 28% more efficient. Validation in simulation with tomato plant models demonstrates the practical benefits for real-time perception and planning in crop robotics, with potential for extension to global planning in larger environments.

Abstract

Robots are increasingly used in tomato greenhouses to automate labour-intensive tasks such as selective harvesting and de-leafing. To perform these tasks, robots must be able to accurately and efficiently perceive the plant nodes that need to be cut, despite the high levels of occlusion from other plant parts. We formulate this problem as a local next-best-view (NBV) planning task where the robot has to plan an efficient set of camera viewpoints to overcome occlusion and improve the quality of perception. Our formulation focuses on quickly improving the perception accuracy of a single target node to maximise its chances of being cut. Previous NBV planning methods mostly focused on global view planning and used random sampling of candidate viewpoints for exploration, which can suffer from high computational costs, ineffective view selection due to poor candidates, or non-smooth trajectories due to inefficient sampling. We propose a gradient-based NBV planner using differentiable ray sampling, which directly estimates the local gradient direction for viewpoint planning to overcome occlusion and improve perception. Through simulation experiments, we show that our planner handles occlusions and improves the 3D reconstruction and position estimation of nodes as well as a sampling-based NBV planner does, while requiring ten times less computation and generating 28% more efficient trajectories.

Paper Structure (36 sections, 7 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: The pipeline of our proposed gradient-based NBV method. The nodes are detected in the color image using Mask R-CNN. The resulting segmented image, together with the depth image, is inserted into the voxel grid to merge information from multiple views. The utility of the current view is computed using differentiable ray sampling, which provides a gradient along which the camera is moved.
  • Figure 2: Qualitative analysis of the trajectories generated by the viewpoint planners. The blue axis shows the viewing direction of the camera. The start and end viewpoints are marked in pink and yellow respectively.
  • Figure 3: Example of three consecutive views from GradientNBV. The top row shows the segmented images and the bottom row shows the viewpoint utility rendered from the current view (yellow: high, blue: low). The nodes were detected in View 3 as the camera moved closer. The utility within the region of interest (ROI) decreased with each successive viewpoint.
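
The core loop described in the Figure 1 caption (score the current view, then move the camera along the gradient of that score) can be illustrated with a small toy sketch. This is not the authors' implementation. It assumes a hypothetical scene with one target node at the origin and one leaf modelled as a Gaussian density blob, uses a soft visibility score in place of the paper's semantic utility, and uses central finite differences in place of differentiable ray sampling. The names `utility`, `utility_gradient`, `plan_views` and all constants are invented for illustration.

```python
import math

# Hypothetical toy scene, not taken from the paper: one target node at the
# origin and one leaf modelled as a Gaussian density blob in front of it.
TARGET = (0.0, 0.0, 0.0)
LEAF_CENTRE = (0.3, 0.0, 0.0)
LEAF_SIGMA = 0.1      # blob width (assumed)
LEAF_DENSITY = 20.0   # peak opacity per unit length (assumed)


def utility(cam, n_samples=32):
    """Soft visibility of TARGET from cam, in (0, 1].

    Samples points along the ray from the camera to the target and
    accumulates leaf density; exp(-optical_depth) is the transmittance.
    """
    seg = math.dist(cam, TARGET) / n_samples  # length of one ray segment
    optical_depth = 0.0
    for i in range(n_samples):
        t = (i + 0.5) / n_samples  # midpoint of segment i
        point = [c + t * (g - c) for c, g in zip(cam, TARGET)]
        d2 = sum((p - q) ** 2 for p, q in zip(point, LEAF_CENTRE))
        optical_depth += LEAF_DENSITY * math.exp(-d2 / (2 * LEAF_SIGMA ** 2)) * seg
    return math.exp(-optical_depth)


def utility_gradient(cam, eps=1e-4):
    """Central finite-difference gradient of utility w.r.t. camera position.

    Stand-in for the analytic gradient a differentiable sampler would give.
    """
    grad = []
    for axis in range(3):
        hi = tuple(c + eps * (k == axis) for k, c in enumerate(cam))
        lo = tuple(c - eps * (k == axis) for k, c in enumerate(cam))
        grad.append((utility(hi) - utility(lo)) / (2 * eps))
    return grad


def plan_views(start, n_steps=25, step_size=0.05):
    """Move the camera a fixed distance along the utility gradient each step."""
    path = [tuple(start)]
    for _ in range(n_steps):
        cam = path[-1]
        grad = utility_gradient(cam)
        norm = math.sqrt(sum(g * g for g in grad))
        if norm < 1e-12:  # flat utility: no preferred direction
            break
        path.append(tuple(c + step_size * g / norm for c, g in zip(cam, grad)))
    return path


# Start almost directly behind the leaf, so the target is heavily occluded.
path = plan_views((1.0, 0.05, 0.0))
print(f"utility at start: {utility(path[0]):.3f}")
print(f"utility at end:   {utility(path[-1]):.3f}")
```

Because each step follows the local gradient with a fixed length, consecutive viewpoints stay close together, which is the intuition behind the smoother trajectories reported for the gradient-based planner. A sampling-based planner would instead evaluate the utility at many candidate viewpoints per step and jump to the best one.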