FuncGrasp: Learning Object-Centric Neural Grasp Functions from Single Annotated Example Object

Hanzhi Chen, Binbin Xu, Stefan Leutenegger

TL;DR

FuncGrasp tackles dense grasp generation for unseen objects by transferring a continuous, object-centric grasp function from a single annotated exemplar using a surface-based neural representation (NSGF). It decouples geometry and grasp modeling by completing geometry on the surface and encoding grasps as a function over surface points, then uses unsupervised semantic primitives to enable cross-object transfer. The approach achieves higher grasp density and reliability than strong baselines in both simulation and real-world experiments, with substantial improvements in omni-grasp and best-grasp metrics. This framework enables data-efficient, scalable grasp transfer, though it relies on accurate geometry completion and simulator fidelity; future work aims at speedups and broader transfer robustness.

Abstract

We present FuncGrasp, a framework that can infer dense yet reliable grasp configurations for unseen objects using one annotated object and a single-view RGB-D observation via categorical priors. Unlike previous works that only transfer a set of grasp poses, FuncGrasp aims to transfer infinite configurations parameterized by an object-centric continuous grasp function across varying instances. To ease the transfer process, we propose Neural Surface Grasping Fields (NSGF), an effective neural representation defined on the surface to densely encode grasp configurations. Further, we exploit function-to-function transfer using sphere primitives to establish semantically meaningful categorical correspondences, which are learned in an unsupervised fashion without any expert knowledge. We demonstrate the framework's effectiveness through extensive experiments in both simulators and the real world. Remarkably, our framework significantly outperforms several strong baseline methods in terms of the density and reliability of the generated grasps.
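
To make the idea of a continuous, surface-defined grasp function concrete, the following is a minimal sketch that fits a smooth function to a handful of discrete annotations and then queries it at arbitrary surface points. It is not the NSGF network: Gaussian kernel smoothing (Nadaraya-Watson) stands in for the neural field, the only grasp parameter modeled is a gripper width, and the names `fit_grasp_function`, `bandwidth`, and the toy sphere data are assumptions made purely for illustration.

```python
import numpy as np

def fit_grasp_function(points, widths, bandwidth=0.3):
    """Return a smooth function mapping surface points to gripper widths.

    points: (N, 3) annotated surface points; widths: (N,) annotated widths.
    Kernel smoothing is a stand-in here, NOT the paper's neural field.
    """
    points = np.asarray(points, dtype=float)
    widths = np.asarray(widths, dtype=float)

    def query(q):
        q = np.atleast_2d(np.asarray(q, dtype=float))
        # Squared distances between every query and every annotation.
        d2 = ((q[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
        weights = np.exp(-d2 / (2.0 * bandwidth ** 2))
        # Normalized weights: each prediction is a convex combination
        # of the annotated widths.
        return (weights @ widths) / weights.sum(axis=1)

    return query

# Toy "object": the unit sphere, with 50 annotated surface points
# (hypothetical data; widths in meters, varying smoothly with height).
rng = np.random.default_rng(0)
annotated = rng.normal(size=(50, 3))
annotated /= np.linalg.norm(annotated, axis=1, keepdims=True)
annotated_widths = 0.04 + 0.01 * annotated[:, 2]

grasp_fn = fit_grasp_function(annotated, annotated_widths)

# Dense query: far more surface points than annotations.
dense = rng.normal(size=(1000, 3))
dense /= np.linalg.norm(dense, axis=1, keepdims=True)
dense_widths = grasp_fn(dense)
```

The point of the sketch is only that a function fitted once from sparse labels can be evaluated at any number of surface points, which is what makes transferring a function, rather than a fixed set of poses, attractive.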

Paper Structure

This paper contains 13 sections, 4 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: Given a partial RGB-D input, our framework transfers a known object's continuous grasp function, fitted from discrete annotations, to the unseen object. We represent such a function using our proposed Neural Surface Grasping Fields (NSGF) formulation. The transfer is achieved by completing the object's geometry and estimating its semantic primitives, which are learned in an unsupervised fashion. Using the transferred NSGF, the robot can query the dense, dependable grasp knowledge embedded in a smooth function to grasp from different configurations.
  • Figure 2: Illustration of our proposed framework, FuncGrasp. (A) Our geometric estimation module infers the target object's 7-DoF pose, completed shape, semantic primitives, and shape-aware confidence (red indicates low; blue indicates high). (B) Our Neural Surface Grasping Field (NSGF) formulation defines point-wise grasp configurations on the surface. (C) We approximate NSGF using semantic primitives. We sample a small number of grasp configurations for each primitive and transfer them to the target object using the corresponding primitive (grippers colored magenta, green, and cyan indicate three different primitives). After adjusting the transferred grasps based on the object's shape and filtering invalid samples in the simulator, we fit a new NSGF using the rest of the samples to achieve grasp function transfer.
  • Figure 3: (A) Raw antipodal contact point using predicted coarse width $w_\text{coarse}$ without geometric awareness, leading to grasp failure due to collision. (B) Use of completed geometry to search for nearby on-surface points. (C) Accurate antipodal contact point thanks to precise shape completion, yielding a collision-free grasp pose.
  • Figure 4: Qualitative results for each category tested in simulation. Source objects with grasp annotations are marked with green dotted boxes. Their NSGFs are fitted with valid labels. For every unseen object, we visualize its ground-truth geometry (inaccessible for unseen objects), the completed geometry inferred from the partial point cloud, the semantic primitives, and the transferred NSGF. We highlight the semantic-primitive-based correspondences among objects with orange dotted boxes and lines (zoom in for details). Note that we intentionally down-sample the grasps inferred by NSGF by a factor of 10 relative to inference time so that the actual ground-truth geometry remains visible.
  • Figure 5: (A) Physical setup for real robot experiments. (B) Tested objects, five instances per category. (C)-(G) Examples of successful grasps.
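
The shape-aware adjustment described in the Figure 3 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the raw antipodal point is taken as the first contact shifted by $w_\text{coarse}$ along a closing direction, the "search for nearby on-surface points" is reduced to a brute-force nearest-neighbor lookup in the completed point cloud, and the names `refine_antipodal_contact`, `closing_dir`, and the toy sphere object are hypothetical.

```python
import numpy as np

def refine_antipodal_contact(contact, closing_dir, w_coarse, completed_points):
    """Snap the raw antipodal contact onto the completed surface.

    Returns (refined_contact, refined_width).
    """
    contact = np.asarray(contact, dtype=float)
    closing_dir = np.asarray(closing_dir, dtype=float)
    closing_dir = closing_dir / np.linalg.norm(closing_dir)

    # (A) Raw antipodal point from the coarse width, no geometric awareness.
    raw = contact + w_coarse * closing_dir

    # (B) Search the completed geometry for the nearest on-surface point
    # (brute-force nearest neighbor; an assumption of this sketch).
    dists = np.linalg.norm(completed_points - raw, axis=1)
    refined = completed_points[np.argmin(dists)]

    # (C) Width implied by the refined contact pair.
    width = float(np.linalg.norm(refined - contact))
    return refined, width

# Toy completed geometry: a sphere of radius 3 cm sampled as a point cloud,
# with the exact opposite pole included so the example is deterministic.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(2000, 3))
cloud = 0.03 * cloud / np.linalg.norm(cloud, axis=1, keepdims=True)
cloud = np.vstack([cloud, [0.03, 0.0, 0.0]])

first_contact = np.array([-0.03, 0.0, 0.0])
closing = np.array([1.0, 0.0, 0.0])

# A coarse width of 4 cm places the raw point inside the 6 cm wide object,
# which would mean a collision; the refinement moves it to the far surface.
refined_contact, refined_width = refine_antipodal_contact(
    first_contact, closing, w_coarse=0.04, completed_points=cloud
)
```

In this toy case the refined width comes out at roughly 0.06 m, the true diameter of the sphere, whereas the coarse 0.04 m prediction would have closed the fingers inside the object. The quality of the result depends entirely on the completed point cloud, which matches the caption's emphasis on precise shape completion.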