CenterGrasp: Object-Aware Implicit Representation Learning for Simultaneous Shape Reconstruction and 6-DoF Grasp Estimation
Eugenio Chisari, Nick Heppert, Tim Welschehold, Wolfram Burgard, Abhinav Valada
TL;DR
CenterGrasp tackles simultaneous 3D shape reconstruction and 6-DoF grasp estimation in clutter by making objects explicit through per-pixel heatmaps and latent codes, and by decoupling object-level geometry and grasps with a shape-and-grasp distance function (SGDF) decoder. The method pairs an RGB-D image encoder, which predicts a latent code and a pose for each object, with a per-object SGDF decoder that reconstructs the full object shape and a manifold of grasps, enabling holistic grasping that also reasons about regions not visible to the camera. Trained entirely on synthetic data, CenterGrasp generalizes zero-shot to real scenes and outperforms the state-of-the-art GIGA on reconstruction and grasp metrics in simulation, with substantial improvements in real-robot experiments. The approach also performs competitively on GraspNet-1Billion and provides a scalable, object-aware framework for robust scene understanding and grasping in cluttered environments. Code and models are publicly released.
Abstract
Reliable object grasping is a crucial capability for autonomous robots. However, many existing grasping approaches focus on general clutter removal without explicitly modeling objects and thus rely only on the visible local geometry. We introduce CenterGrasp, a novel framework that combines object awareness and holistic grasping. CenterGrasp learns a general object prior by encoding shapes and valid grasps in a continuous latent space. It consists of an RGB-D image encoder that leverages recent advances to detect objects and infer their pose and latent code, and a decoder to predict shape and grasps for each object in the scene. We perform extensive experiments on simulated as well as real-world cluttered scenes and demonstrate strong scene reconstruction and 6-DoF grasp-pose estimation performance. Compared to the state of the art, CenterGrasp achieves an improvement of 38.5 mm in shape reconstruction and 33 percentage points on average in grasp success. We make the code and trained models publicly available at http://centergrasp.cs.uni-freiburg.de.
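
To make the idea of per-pixel heatmaps and latent codes concrete, here is a minimal sketch of how object centers might be read out of an encoder's output. It is not the authors' implementation: the function name `extract_object_peaks`, the 3x3 local-maximum rule, the threshold value, and the toy inputs are all assumptions made for illustration. The sketch keeps every pixel whose heatmap score exceeds a threshold and is strictly larger than its neighbours, then looks up the latent code stored at that pixel.

```python
from typing import List, Sequence, Tuple

# (row, col, score, latent code) for one detected object center.
Peak = Tuple[int, int, float, Sequence[float]]


def extract_object_peaks(
    heatmap: List[List[float]],
    latent_map: List[List[Sequence[float]]],
    threshold: float = 0.5,  # assumed value, not taken from the paper
) -> List[Peak]:
    """Return one entry per strict local maximum of `heatmap` above `threshold`.

    Hypothetical helper: the 3x3 neighbourhood rule is an assumption.
    Ties (plateaus) are not reported as peaks because the comparison is strict.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    peaks: List[Peak] = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the up-to-8 neighbours inside the image bounds.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score > n for n in neighbours):
                # The latent code is read at the same pixel as the peak.
                peaks.append((r, c, score, latent_map[r][c]))
    return peaks


# Toy 4x4 heatmap with two object centers, at (1, 1) and (3, 3).
toy_heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.1, 0.4, 0.8],
]
# Toy latent map: each pixel stores a 2-D code equal to its own coordinates.
toy_latents = [[(float(r), float(c)) for c in range(4)] for r in range(4)]

toy_peaks = extract_object_peaks(toy_heatmap, toy_latents)
```

In a full pipeline, each extracted latent code, together with the predicted object pose, would be passed to the SGDF decoder to recover that object's shape and grasps. That decoding step is a learned network and is not sketched here.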
