RL-GSBridge: 3D Gaussian Splatting Based Real2Sim2Real Method for Robotic Manipulation Learning
Yuxuan Wu, Lei Pan, Wenhua Wu, Guangming Wang, Yanzi Miao, Fan Xu, Hesheng Wang
TL;DR
This work tackles the sim-to-real gap in vision-based robotic manipulation by introducing RL-GSBridge, a Real2Sim2Real framework that uses 3D Gaussian Splatting (GS) to construct realistic, editable scene representations from real imagery. It combines a soft mesh-binding GS model, for accurate geometry and texture, with physics-informed GS editing that keeps the rendered visuals synchronized with the simulated dynamics, enabling zero-shot transfer from simulation to real robots. Policies are trained with SAC (and SACwB guidance) on PyBullet dynamics, with a GS renderer producing realistic observations, and are then deployed directly on real hardware without fine-tuning. Experiments on grasping and pick-and-place tasks demonstrate robust sim-to-real transfer and high rendering fidelity, with fewer artifacts than hard-binding GS methods and consistent behavior between the simulated and real environments.
Abstract
Sim-to-real refers to the process of transferring policies learned in simulation to the real world, which is crucial for practical robotics applications. However, recent sim-to-real methods rely on either large amounts of augmented data or large learning models, both of which are inefficient for specific tasks. In recent years, the emergence of radiance field reconstruction methods, especially 3D Gaussian Splatting (GS), has made it possible to construct realistic representations of real-world scenes. To this end, we propose RL-GSBridge, a novel real-to-sim-to-real framework that incorporates 3D Gaussian Splatting into the conventional RL simulation pipeline, enabling zero-shot sim-to-real transfer for vision-based deep reinforcement learning. We introduce a mesh-based 3D GS method with soft binding constraints, which enhances the rendering quality of mesh models. By using a GS editing approach to synchronize the rendering with the physics simulator, RL-GSBridge accurately reflects the visual interactions of the physical robot. Through a series of sim-to-real experiments, including grasping and pick-and-place tasks, we demonstrate that RL-GSBridge maintains a satisfactory success rate in real-world task completion during sim-to-real transfer. Furthermore, rendering metrics and visualization results indicate that the proposed mesh-based 3D GS reduces artifacts in unstructured objects and yields more realistic rendering.
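
To make the idea of synchronizing the rendering with the physics simulator more concrete, the following is a minimal sketch of one plausible building block: moving a single Gaussian rigidly with an object pose reported by the simulator. It is an illustration under stated assumptions, not the paper's implementation. The function names (`quat_to_matrix`, `transform_gaussian`) are hypothetical, the pose is assumed to arrive as a position plus a unit quaternion in (x, y, z, w) order (the convention PyBullet uses), and the soft mesh binding described above is not modeled. The only fact relied on is that a rigid transform (R, t) maps a Gaussian mean μ to Rμ + t and its covariance Σ to RΣRᵀ.

```python
def quat_to_matrix(q):
    """Convert a unit quaternion (x, y, z, w) to a 3x3 rotation matrix."""
    x, y, z, w = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    """Transpose a 3x3 matrix given as nested lists."""
    return [list(row) for row in zip(*a)]


def transform_gaussian(mean, cov, position, quat):
    """Apply a rigid object pose to one Gaussian (hypothetical helper).

    mean:     Gaussian center in the object's local frame, length-3 list
    cov:      3x3 covariance in the object's local frame
    position: object position in the world frame (assumed from the simulator)
    quat:     object orientation as a unit quaternion (x, y, z, w)
    """
    R = quat_to_matrix(quat)
    # Mean: rotate, then translate (R @ mean + t).
    new_mean = [sum(R[i][k] * mean[k] for k in range(3)) + position[i]
                for i in range(3)]
    # Covariance: rotate only (R @ cov @ R^T); translation does not affect it.
    new_cov = matmul(matmul(R, cov), transpose(R))
    return new_mean, new_cov
```

In a full pipeline, a transform of this kind would be applied at every simulation step to all Gaussians belonging to each movable object before rendering the observation; how RL-GSBridge organizes that step in practice is not specified in this section.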
