A Dataset and Benchmark for Shape Completion of Fruits for Agricultural Robotics

Federico Magistri, Thomas Läbe, Elias Marks, Sumanth Nagulavancha, Yue Pan, Claus Smitt, Lasse Klingbeil, Michael Halstead, Heiner Kuhlmann, Chris McCool, Jens Behley, Cyrill Stachniss

TL;DR

This work introduces the first public dataset for 3D shape completion of fruits in agriculture, pairing RGB-D observations with high-precision LiDAR ground truth to recover complete fruit meshes under occlusion. The dataset covers lab and greenhouse settings, includes per-fruit segmentation masks and a principled registration pipeline to canonical fruit poses, and supports a hidden-test benchmark via CodaLab. It provides a PyTorch data loader and evaluation toolkit, plus baseline methods (CoRe, HoMa, T-CoRe) to establish a competitive benchmark. By enabling evaluation under a domain gap between lab and greenhouse data, the work aims to foster robust 3D perception and manipulation for autonomous fruit harvesting and phenotyping in real-world farms.

Abstract

As the world population is expected to reach 10 billion by 2050, our agricultural production system needs to double its productivity despite a declining human workforce in the agricultural sector. Autonomous robotic systems are one promising pathway to increase productivity by taking over labor-intensive manual tasks like fruit picking. To be effective, such systems need to monitor and interact with plants and fruits precisely, which is challenging because the cluttered nature of agricultural environments causes, for example, strong occlusions. Thus, being able to estimate the complete 3D shapes of objects in the presence of occlusions is crucial for automating operations such as fruit harvesting. In this paper, we propose the first publicly available 3D shape completion dataset for agricultural vision systems. We provide an RGB-D dataset for estimating the 3D shape of fruits. Specifically, our dataset contains RGB-D frames of single sweet peppers both in lab conditions and in a commercial greenhouse. For each fruit, we additionally collected high-precision point clouds that we use as ground truth. To acquire the ground truth shape, we developed a measuring process that allows us to record data of real sweet pepper plants with high precision, both in the lab and in the greenhouse, and to determine the shape of the sensed fruits. We release our dataset, consisting of almost 7,000 RGB-D frames belonging to more than 100 different fruits. We provide segmented RGB-D frames, with camera intrinsics to easily obtain colored point clouds, together with the corresponding occlusion-free point clouds obtained with a high-precision laser scanner. We additionally enable evaluation of shape completion approaches on a hidden test set through a public challenge on a benchmark server.
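
The abstract notes that the segmented RGB-D frames and camera intrinsics make it easy to obtain colored point clouds. A minimal sketch of that back-projection under a pinhole camera model is shown below. The function name `backproject_rgbd`, the argument layout, and the assumption that depth is stored in millimetres (`depth_scale=1000.0`) are illustrative choices, not the dataset's actual file format or the API of the released data loader.

```python
import numpy as np

def backproject_rgbd(depth, rgb, mask, fx, fy, cx, cy, depth_scale=1000.0):
    """Back-project masked depth pixels into a colored point cloud.

    depth: (H, W) raw depth image; depth_scale converts it to metres
           (millimetre storage is an assumption, not taken from the dataset).
    rgb:   (H, W, 3) uint8 color image aligned with the depth image.
    mask:  (H, W) fruit segmentation mask (nonzero = fruit pixel).
    fx, fy, cx, cy: pinhole intrinsics in pixels.
    Returns (N, 3) points in the camera frame and (N, 3) colors in [0, 1].
    """
    # Keep only fruit pixels that carry a valid depth measurement.
    valid = mask.astype(bool) & (depth > 0)
    v, u = np.nonzero(valid)  # v = row index, u = column index

    # Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    z = depth[v, u].astype(np.float64) / depth_scale
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy

    points = np.stack([x, y, z], axis=1)
    colors = rgb[v, u].astype(np.float64) / 255.0
    return points, colors
```

The resulting partial, colored point cloud is the kind of input a shape completion method would receive; the real depth scale and intrinsics should be read from the files shipped with the dataset.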

Paper Structure (11 sections, 6 equations, 5 figures, 3 tables)

This paper contains 11 sections, 6 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: With our dataset, we tackle the problem of estimating the shape of fruits as a mesh (shown in grey) given a partial observation from an RGB-D sensor providing a colored point cloud. The estimated shape of the fruit is essential to allow safe grasping in situations with severe occlusions, such as a greenhouse environment.
  • Figure 2: Registration procedure for the lab scenario. Given the TSDF-aligned RGB-D frames with the corresponding point cloud (on the left side) and the dense point cloud of the LiDAR (on the right side), we estimate planes in each point cloud (shown in the middle). With an initial pose estimated from the extracted planes, we automatically register both point clouds using ICP, resulting in the final registration.
  • Figure 3: Measuring procedure to align greenhouse RGB-D frames with the corresponding ground truth point cloud generated by the LiDAR. Given two recordings with and without markers, we first align the photogrammetric point clouds via a transformation $\boldsymbol{T}$, which allows us to associate the sweet peppers without and with markers. Based on the markers and manually identified pins, we are able to determine the transformation of the scanned fruit in the RGB-D frame yielding the final registration.
  • Figure 4: We show a few examples of input point clouds (left) and ground truth point clouds (right). The shape completion task involves estimating a complete 3D mesh from a partial and noisy point cloud.
  • Figure 5: Qualitative examples of our registration results in the lab (top) and greenhouse (bottom). The ground truth point cloud, shown in red, is aligned with the corresponding RGB-D frames.
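
The Figure 2 caption describes refining an initial pose with ICP. The sketch below illustrates only that refinement step, as a basic point-to-point ICP with brute-force nearest neighbors and a closed-form (Kabsch/SVD) pose update. It is not the authors' implementation: the plane-based initialization is not reproduced and is represented by the hypothetical `R_init` and `t_init` arguments, and all function names are illustrative. A practical pipeline would typically use a KD-tree and a library implementation for dense LiDAR clouds.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Closed-form least-squares R, t such that R @ src[i] + t ~ dst[i]."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)  # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def nearest_neighbors(src, dst):
    """Brute-force nearest neighbor in dst for every point in src."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    return idx, np.sqrt(d2[np.arange(len(src)), idx])

def icp(src, dst, R_init=None, t_init=None, iters=30, tol=1e-12):
    """Point-to-point ICP aligning src to dst, starting from an initial pose."""
    R = np.eye(3) if R_init is None else np.array(R_init, dtype=float)
    t = np.zeros(3) if t_init is None else np.array(t_init, dtype=float)
    prev_err = np.inf
    for _ in range(iters):
        moved = src @ R.T + t
        idx, dist = nearest_neighbors(moved, dst)
        err = (dist ** 2).mean()  # mean squared residual of the current pose
        if prev_err - err < tol:  # stop once the residual no longer improves
            break
        prev_err = err
        # Re-estimate the full src -> dst pose from the current matches.
        R, t = best_rigid_transform(src, dst[idx])
    return R, t
```

As with any ICP variant, the result depends on the quality of the initial pose, which is why the caption's plane-based initialization precedes the refinement.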