Mushroom Segmentation and 3D Pose Estimation from Point Clouds using Fully Convolutional Geometric Features and Implicit Pose Encoding

George Retsinas, Niki Efthymiou, Petros Maragos

TL;DR

This work tackles the challenge of segmenting mushrooms and estimating their 3D pose from point clouds under limited 3D annotations. It introduces a synthetic mushroom-scene dataset and an implicit pose encoding built on a sparse 3D FCGF backbone, enabling per-point predictions for segmentation and pose without explicit per-instance labels. The approach uses a combination of pole-point existence, center residuals, and orientation cues, together with clustering to yield instance segmentation and an ellipsoid-based pose estimation that leverages orientation signals. Experimental results on synthetic data show strong detection and pose-estimation performance, while qualitative real-data results suggest promising synthetic-to-real transfer; code is released for public use.
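The step from per-point center residuals to instance segmentation can be illustrated with a small sketch. It assumes the network has already produced one 3D residual per point; the function names, the flat-kernel mean shift and the `bandwidth` value are illustrative choices, not taken from the released code, and mean shift stands in here for whichever mode-seeking clustering the authors use.

```python
import numpy as np

def shift_points(points, residuals):
    """Move each point by its predicted residual so that mushroom points
    pile up around their (predicted) mushroom center."""
    return points + residuals

def mean_shift_labels(votes, bandwidth=0.5, n_iter=30):
    """Flat-kernel mean shift, a generic mode-seeking clustering.
    Hypothetical stand-in: bandwidth and iteration count are assumptions."""
    modes = votes.copy()
    for _ in range(n_iter):
        # Distance from every current mode estimate to every vote.
        dist = np.linalg.norm(modes[:, None, :] - votes[None, :, :], axis=2)
        weights = (dist < bandwidth).astype(float)
        # Each mode moves to the mean of the votes inside its window.
        modes = (weights @ votes) / weights.sum(axis=1, keepdims=True)
    # Merge modes that converged to (almost) the same location.
    labels = np.full(len(votes), -1, dtype=int)
    centers = []
    for i, mode in enumerate(modes):
        for k, center in enumerate(centers):
            if np.linalg.norm(mode - center) < bandwidth / 2:
                labels[i] = k
                break
        else:
            centers.append(mode)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)
```

Points that share a label form one mushroom candidate. In a real scene, background points with small random displacements would still have to be filtered out, for example by discarding small clusters; that filtering is not shown here.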

Abstract

Modern agricultural applications rely more and more on deep learning solutions. However, training well-performing deep networks requires a large amount of annotated data that may not be available and, in the case of 3D annotation, may not even be feasible for human annotators. In this work, we develop a deep learning approach to segment mushrooms and estimate their pose on 3D data, in the form of point clouds acquired by depth sensors. To circumvent the annotation problem, we create a synthetic dataset of mushroom scenes, where we are fully aware of 3D information, such as the pose of each mushroom. The proposed network has a fully convolutional backbone that parses sparse 3D data and predicts pose information that implicitly defines both the instance segmentation and the pose estimation tasks. We have validated the effectiveness of the proposed implicit-based approach on a synthetic test set, and we provide qualitative results for a small set of real point clouds acquired with depth sensors. Code is publicly available at https://github.com/georgeretsi/mushroom-pose.

Paper Structure (11 sections, 3 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: 3D mesh of the mushroom template.
  • Figure 2: Generated scenes of synthetic point clouds. Distractors are clearly visible in the first two images. Mushroom regions are highlighted in red.
  • Figure 3: Overview of the proposed system. Given a point cloud input of a mushroom scene, the proposed deep network predicts the three categories of task-relevant information. Using a mode-seeking clustering over the predicted centers, we obtain the instance segmentation result. Then each mushroom region is processed as an ellipsoid structure and the corresponding 3D pose is estimated.
  • Figure 4: The residual center information represented as a translation vector (top) and the corresponding transformation that creates dense regions around mushroom centers (bottom). Note how the mushroom points in the top image are directed towards the center, while background points have smaller random displacements.
  • Figure 5: Qualitative examples of segmentation and pose estimation: the first row corresponds to data acquired from a setup of two RGB-D cameras, while the second row corresponds to multi-view data of 18 views from a rotating camera system. The top-right image is a single-view example. We use a red oriented bounding box to denote the pose of each mushroom. Zoom in for details.
  • ...and 1 more figures
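
The Figure 3 caption mentions treating each segmented mushroom region as an ellipsoid. A rough way to picture this is a principal-axis fit, sketched below. This is an assumption-laden illustration, not the authors' fitting procedure: `ellipsoid_pose` and the `up_hint` argument are hypothetical names, and PCA is used as a simple stand-in for an ellipsoid fit. The `up_hint` vector plays the role of an orientation cue that resolves the sign ambiguity of the shortest axis.

```python
import numpy as np

def ellipsoid_pose(points, up_hint=None):
    """Approximate a point region by an ellipsoid via PCA (illustrative only).

    Returns the centroid, the principal axes as rows (shortest first) and the
    standard deviation along each axis as a proxy for the ellipsoid radii.
    """
    center = points.mean(axis=0)
    cov = np.cov((points - center).T)        # 3x3 covariance of the region
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    axes = eigvecs.T.copy()                  # rows are unit-length axes
    radii = np.sqrt(np.maximum(eigvals, 0.0))
    # PCA leaves the sign of every axis undefined; an orientation cue
    # (assumed here to be a rough "up" direction) can fix the shortest axis.
    if up_hint is not None and axes[0] @ np.asarray(up_hint, dtype=float) < 0:
        axes[0] = -axes[0]
    return center, axes, radii
```

Flipping a single axis can change the handedness of the axis set, so a full pipeline would need an extra step to turn these axes into a proper rotation matrix.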