GeoMatch++: Morphology Conditioned Geometry Matching for Multi-Embodiment Grasping

Yunze Wei, Maria Attarian, Igor Gilitschenski

TL;DR

The paper addresses the challenge of generalizing dexterous grasping to unseen gripper morphologies. It introduces GeoMatch++, a morphology-conditioned framework that encodes object, gripper, and morphology graphs with GCNs, fuses them through transformer-based attention to learn geometry correlations, and autoregressively predicts contact points. On out-of-domain end-effectors, it reports a mean grasp success rate of about 71.7% and a diversity of 0.257, an average gain of about 9.64 percentage points in success rate over prior methods, while in-domain performance stays close to theirs. The results suggest that incorporating detailed morphology as a graph-structured inductive bias enables more robust zero-shot generalization to new end-effectors, a step toward versatile, real-world dexterous manipulation.

Abstract

Despite recent progress on multi-finger dexterous grasping, current methods focus on single grippers and unseen objects, and even those that explore cross-embodiment often fail to generalize well to unseen end-effectors. This work addresses the problem of dexterous grasping generalization to unseen end-effectors via a unified policy that learns the correlation between gripper morphology and object geometry. Robot morphology contains rich information about how joints and links connect and move with respect to each other; we therefore leverage it through attention to learn better end-effector geometry features. Our experiments show an average 9.64% increase in grasp success rate across 3 out-of-domain end-effectors compared to previous methods.
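
To make the phrase "leverage it through attention" concrete, here is a minimal sketch of scaled dot-product cross-attention in which gripper geometry features act as queries and morphology node features act as keys and values. It is an illustration only, not the authors' implementation: the function names, the feature sizes (6 gripper points, 4 morphology nodes, 8 channels), and the random inputs are all assumptions, and learned projection matrices are omitted for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    # Scaled dot-product attention: each query row attends over all key rows.
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ values, weights

rng = np.random.default_rng(0)
gripper_feats = rng.normal(size=(6, 8))  # hypothetical: 6 gripper points, 8 channels
morph_feats = rng.normal(size=(4, 8))    # hypothetical: 4 morphology nodes, 8 channels

# Gripper features are re-expressed as attention-weighted mixes of morphology features.
fused, attn = cross_attention(gripper_feats, morph_feats, morph_feats)
print(fused.shape, attn.shape)
```

Each row of `attn` is a probability distribution over morphology nodes, so every gripper point receives a weighted summary of the kinematic structure around it.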

Paper Structure

This paper contains 18 sections, 1 equation, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Sample morphology graph for Barrett hand with labelled keypoints.
  • Figure 2: Architecture for GeoMatch++. GCNs learn latent features for object and gripper point clouds, and end-effector morphology. Features are passed into transformer modules to learn the object-gripper correspondence. Autoregressive matching predicts final contact points using MLP layers.
  • Figure 3: Qualitative grasp results on unseen grippers.
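
As a rough illustration of how a GCN can encode a morphology graph such as the one in Figure 1, the sketch below applies one graph-convolution layer, using the common symmetric normalization with self-loops, to a toy kinematic tree. Everything specific here is an assumption: the five-node graph is hypothetical and is not the Barrett hand topology, the feature sizes and random weights are placeholders, and the paper's actual layer design may differ.

```python
import numpy as np

def normalized_adjacency(edges, num_nodes):
    # Build A + I (undirected edges plus self-loops), then scale by D^{-1/2} on both sides.
    a = np.eye(num_nodes)
    for i, j in edges:
        a[i, j] = 1.0
        a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_hat, h, w):
    # One graph convolution: aggregate neighbor features, project, apply ReLU.
    return np.maximum(a_hat @ h @ w, 0.0)

# Hypothetical kinematic tree: node 0 is a palm with two 2-link fingers.
edges = [(0, 1), (1, 2), (0, 3), (3, 4)]
a_hat = normalized_adjacency(edges, num_nodes=5)

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 3))  # placeholder per-node input features
w = rng.normal(size=(3, 4))  # placeholder layer weights (learned in practice)

out = gcn_layer(a_hat, h, w)
print(out.shape)
```

After one layer, each node's feature mixes information from itself and its directly connected links; stacking layers lets information propagate further along the kinematic chain.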