RGBGrasp: Image-based Object Grasping by Capturing Multiple Views during Robot Arm Movement with Neural Radiance Fields
Chang Liu, Kejian Shi, Kaichen Zhou, Haoxiao Wang, Jiyao Zhang, Hao Dong
TL;DR
RGBGrasp tackles robust 3D perception for robotic grasping under limited RGB views by integrating monocular depth priors with neural radiance fields. The method operates with an eye-on-hand camera to accumulate views during manipulation, and introduces a depth rank loss, hash encoding, and a proposal sampler to accelerate NeRF-based scene reconstruction. A downstream grasp detector then computes 6-DoF poses from the reconstructed point cloud, enabling real-time grasping. Across simulation and real-robot experiments, RGBGrasp demonstrates strong performance on diffuse, transparent, and specular objects, outperforming several RGB- and RGB-D-based baselines while reducing training and inference time.
Abstract
Grasping objects of varied shapes, materials, and textures remains a significant challenge in robotics. Unlike many prior approaches, which rely on specialized point-cloud cameras or abundant RGB views to recover 3D information for grasping, this paper introduces RGBGrasp, a method that uses only a limited set of RGB views to perceive 3D surroundings containing transparent and specular objects and to grasp them accurately. Our method uses pre-trained depth prediction models to establish geometric constraints, enabling precise 3D structure estimation even under limited-view conditions. In addition, we integrate hash encoding and a proposal sampler strategy to significantly accelerate 3D reconstruction. Together, these innovations improve the adaptability and effectiveness of our algorithm in real-world scenarios. Through comprehensive experimental validation, we demonstrate that RGBGrasp succeeds across a wide spectrum of object-grasping scenarios, establishing it as a promising solution for real-world robotic manipulation tasks. Demonstrations of our method can be found at: https://sites.google.com/view/rgbgrasp
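
To make the depth-based geometry constraint concrete, the following is a minimal sketch of a pairwise depth rank loss. It illustrates the general idea under stated assumptions and is not the paper's exact formulation: the hinge form, the `margin` value, and the names `depth_rank_loss`, `rendered`, `mono`, and `pairs` are all hypothetical. The motivation is that monocular depth predictions are typically reliable only up to an unknown scale and shift, so the sketch penalizes only pixel pairs whose rendered depth ordering contradicts the ordering given by the prior.

```python
import numpy as np

def depth_rank_loss(rendered, mono, pairs, margin=1e-4):
    """Hinge penalty on pixel pairs whose rendered depth order
    contradicts the order given by a monocular depth prior.

    rendered: (N,) depths rendered by the radiance field (assumed input)
    mono:     (N,) depths from a pre-trained monocular model (assumed input)
    pairs:    (P, 2) integer indices of the pixel pairs to compare
    margin:   hypothetical slack term, not taken from the paper
    """
    i, j = pairs[:, 0], pairs[:, 1]
    # +1 where the prior says pixel i is farther than j, -1 where nearer, 0 for ties
    sign = np.sign(mono[i] - mono[j])
    # The rendered difference should have the same sign as the prior's difference
    violation = np.maximum(0.0, margin - sign * (rendered[i] - rendered[j]))
    # Ties in the prior carry no ordering information, so they contribute nothing
    violation = np.where(sign == 0, 0.0, violation)
    return float(violation.mean())

# Usage: the prior and the rendering agree on ordering despite different scales
rendered = np.array([1.0, 2.0, 3.0])
mono = np.array([0.1, 0.5, 0.9])
pairs = np.array([[0, 1], [1, 2], [0, 2]])
loss_consistent = depth_rank_loss(rendered, mono, pairs)
loss_flipped = depth_rank_loss(rendered[::-1], mono, pairs)
```

In this sketch the loss is zero when the orderings agree, regardless of absolute scale, and grows as the rendered depths invert the order suggested by the prior. In practice such a term would be added to the photometric loss with a weighting factor, which is also an assumption here.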
