6-DoF Grasp Pose Evaluation and Optimization via Transfer Learning from NeRFs
Gergely Sóti, Xi Huang, Christian Wurll, Björn Hein
TL;DR
This work introduces an implicit grasping framework that leverages a pretrained MVNeRF scene representation to evaluate 6-DoF grasp candidates using a learned scoring function trained from few demonstrations. Grasp poses are optimized via gradient ascent by maximizing the evaluation score, enabling generalization from simulated 4-DoF top-down grasps to 6-DoF grasps in both cluttered simulations and real-world environments without additional data. The MVNeRF backbone enables transfer of visual and geometric scene priors, and the approach is evaluated in simple, cluttered, and novel-object simulation settings as well as in real-world experiments, showing robust sim-to-real transfer and highlighting calibration sensitivity as a key real-world challenge. The work demonstrates the viability of NeRF-based implicit representations for real-time grasp planning, achieving competitive performance with limited training data and suggesting avenues for enhanced task grounding and planning. Overall, it advances data-efficient, geometry-aware grasping by uniting NeRF scene representations with implicit grasp evaluation and gradient-based optimization.
Abstract
We address the problem of robotic grasping of known and unknown objects using implicit behavior cloning. From a small number of demonstrations, we train a grasp evaluation model that outputs higher values for grasp candidates that are more likely to succeed. This evaluation model serves as an objective function that we maximize to identify successful grasps. Key to our approach is the use of learned implicit representations of visual and geometric features derived from a pre-trained NeRF. Though trained exclusively in a simulated environment with simplified objects and 4-DoF top-down grasps, our evaluation model and optimization procedure generalize to 6-DoF grasps and novel objects, both in simulation and in real-world settings, without the need for additional data. Supplementary material is available at: https://gergely-soti.github.io/grasp
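The idea of treating the evaluation model as an objective and maximizing it over grasp poses can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: `toy_score` is a hypothetical quadratic stand-in for the learned evaluation model, `TOY_TARGET` is an invented "best" pose, the pose is flattened into a 6-vector of position and Euler angles, and the gradient is estimated by finite differences rather than by backpropagation through a network.

```python
from typing import Callable, List, Sequence

# Hypothetical "ideal" grasp pose [x, y, z, roll, pitch, yaw]; invented for this sketch.
TOY_TARGET = [0.4, 0.0, 0.1, 0.0, 3.14, 0.0]


def toy_score(pose: Sequence[float]) -> float:
    """Stand-in for a learned grasp evaluation model: higher means better."""
    return -sum((p - t) ** 2 for p, t in zip(pose, TOY_TARGET))


def numerical_gradient(
    score: Callable[[Sequence[float]], float],
    pose: Sequence[float],
    eps: float = 1e-5,
) -> List[float]:
    """Central-difference estimate of d(score)/d(pose)."""
    grad = []
    for i in range(len(pose)):
        up = list(pose)
        down = list(pose)
        up[i] += eps
        down[i] -= eps
        grad.append((score(up) - score(down)) / (2 * eps))
    return grad


def optimize_grasp(
    score: Callable[[Sequence[float]], float],
    initial_pose: Sequence[float],
    step_size: float = 0.1,
    num_steps: int = 200,
) -> List[float]:
    """Gradient ascent: repeatedly move the pose in the direction of increasing score."""
    pose = list(initial_pose)
    for _ in range(num_steps):
        grad = numerical_gradient(score, pose)
        pose = [p + step_size * g for p, g in zip(pose, grad)]
    return pose


if __name__ == "__main__":
    start = [0.0] * 6
    best = optimize_grasp(toy_score, start)
    print("initial score:", toy_score(start))
    print("optimized score:", toy_score(best))
```

In a real system the score would come from a trained network conditioned on scene features, and the orientation would likely need a parameterization better behaved than raw Euler angles; the sketch only shows the optimization loop itself.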
