RL-GSBridge: 3D Gaussian Splatting Based Real2Sim2Real Method for Robotic Manipulation Learning

Yuxuan Wu; Lei Pan; Wenhua Wu; Guangming Wang; Yanzi Miao; Fan Xu; Hesheng Wang

TL;DR

This work tackles the sim-to-real gap in vision-based robotic manipulation by introducing RL-GSBridge, a Real2Sim2Real framework that leverages 3D Gaussian Splatting to construct realistic, editable scene representations from real imagery. It combines a soft mesh binding GS model for accurate geometry and texture with physics-informed GS editing to synchronize visuals with dynamics, enabling zero-shot transfer from simulation to real robots. The method uses PyBullet for dynamics training with SAC (and SACwB guidance) and a GS renderer to produce realistic observations, then transfers directly to real hardware without fine-tuning. Experiments on grasping and pick-and-place tasks demonstrate robust sim-to-real transfer and high rendering fidelity, with fewer artifacts than hard-binding GS methods and strong behavior consistency between sim and real environments.

Abstract

Sim-to-real refers to the process of transferring policies learned in simulation to the real world, which is crucial for practical robotics applications. However, recent sim-to-real methods rely on either a large amount of augmented data or large learning models, both of which are inefficient for specific tasks. In recent years, the emergence of radiance field reconstruction methods, especially 3D Gaussian Splatting (GS), has made it possible to construct realistic real-world scenes. To this end, we propose RL-GSBridge, a novel real-to-sim-to-real framework that incorporates 3D Gaussian Splatting into the conventional RL simulation pipeline, enabling zero-shot sim-to-real transfer for vision-based deep reinforcement learning. We introduce a mesh-based 3D GS method with soft binding constraints, which enhances the rendering quality of mesh models. A GS editing approach then synchronizes the rendering with the physics simulator, so that RL-GSBridge accurately reflects the visual interactions of the physical robot. Through a series of sim-to-real experiments, including grasping and pick-and-place tasks, we demonstrate that RL-GSBridge maintains a satisfactory success rate in real-world task completion during sim-to-real transfer. Furthermore, a series of rendering metrics and visualization results indicate that our proposed mesh-based 3D GS reduces artifacts in unstructured objects and renders more realistically.
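The synchronization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each object's Gaussians move as one rigid body and that the simulator reports the object pose as a position plus a unit quaternion in (x, y, z, w) order, which is the order PyBullet uses. The function names `quat_to_matrix`, `quat_multiply`, and `apply_object_pose` are hypothetical and are not taken from the paper; the paper's mesh-bound editing may differ in detail.

```python
import math

def quat_to_matrix(q):
    """Convert a quaternion (x, y, z, w) to a 3x3 rotation matrix (nested lists)."""
    x, y, z, w = q
    n = math.sqrt(x * x + y * y + z * z + w * w)  # normalize defensively
    x, y, z, w = x / n, y / n, z / n, w / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]

def quat_multiply(a, b):
    """Hamilton product a * b for quaternions stored as (x, y, z, w)."""
    ax, ay, az, aw = a
    bx, by, bz, bw = b
    return (
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
        aw * bw - ax * bx - ay * by - az * bz,
    )

def apply_object_pose(means, rotations, position, orientation):
    """Move object-frame Gaussians into the world frame given a simulator pose.

    means:      list of (x, y, z) Gaussian centers in the object frame
    rotations:  list of (x, y, z, w) Gaussian orientations in the object frame
    position, orientation: object pose reported by the physics simulator
    """
    R = quat_to_matrix(orientation)
    new_means = [
        tuple(sum(R[i][k] * m[k] for k in range(3)) + position[i] for i in range(3))
        for m in means
    ]
    # Composing with the object rotation keeps each Gaussian's covariance aligned.
    new_rotations = [quat_multiply(orientation, q) for q in rotations]
    return new_means, new_rotations
```

Calling `apply_object_pose` once per simulation step, with the pose read back from the simulator, would keep the rendered Gaussians aligned with the simulated object before each observation is rendered. Scales and opacities are left unchanged because a rigid transform does not affect them.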

Paper Structure (22 sections, 9 equations, 6 figures, 4 tables)

INTRODUCTION
RELATED WORK
Sim2Real Transfer in RL
Radiance Field in Robotics
METHODS
Real2Sim: Building simulator with soft mesh binding GS
Real-world Data Preparing
Real2Sim modeling by soft mesh binding GS
Physic Dynamics-Based GS Editing
Sim2Real: Train in simulation with physic dynamics-based GS renderer and zero-shot transfer to reality
EXPERIMENTS
Experiment Setup
Robot Platform
Tasks
Evaluation Setup
...and 7 more sections

Figures (6)

Figure 1: Pipeline of RL-GSBridge. (1) Real2Sim Environment Transfer. Real-world scenarios are reconstructed through a novel soft mesh binding GS model. (2) Learn Policy in Simulator with GS Rendering. With physics dynamics-based GS editing, RL policies learn from realistic rendered images in simulation. (3) Zero-shot Real-world Robot Manipulation. We directly apply the policy to real-world tasks without fine-tuning.
Figure 2: Mesh-based GS Reconstruction with Soft Binding Constraints. Relaxing the hard constraints of GaMeS (Waczyńska et al., 2024) along the normal direction yields smoother and more flexible object surfaces.
Figure 3: Policy training pipeline in RL-GSBridge. In the upper half of the figure, physics dynamics-based GS editing receives the transformation signals of objects and synchronizes the states of the GS models. In the lower half, an actor-critic RL network takes first-person images rendered by the GS models as input to learn a vision-based manipulation policy.
Figure 4: Comparison of sim-to-real behavior consistency between RL-GSBridge and RL-sim.
Figure 5: Our soft binding constraint reconstruction method compared with GaMeS (Waczyńska et al., 2024) on two foreground objects and two backgrounds.
...and 1 more figures
