Re$^3$Sim: Generating High-Fidelity Simulation Data via 3D-Photorealistic Real-to-Sim for Robotic Manipulation

Xiaoshen Han, Minghuan Liu, Yilun Chen, Junqiu Yu, Xiaoyang Lyu, Yang Tian, Bolun Wang, Weinan Zhang, Jiangmiao Pang

TL;DR

This work presents Re$^3$Sim, a real-to-sim-to-real pipeline that narrows both the geometric and visual gaps between real and simulated robotics environments by coupling 3D reconstruction with Gaussian-based rendering. It demonstrates rapid scene setup, real-time cross-view rendering, and zero-shot sim-to-real transfer for tabletop manipulation, using privileged information to generate demonstrations and imitation learning to train policies. Large-scale synthetic datasets enable policies that generalize across objects and tasks, reducing reliance on costly real-world data. The approach offers a scalable path to high-fidelity simulation data for pre-training robust robotic manipulation policies.

Abstract

Real-world data collection for robotics is costly and resource-intensive, requiring skilled operators and expensive hardware. Simulations offer a scalable alternative but often fail to achieve sim-to-real generalization due to geometric and visual gaps. To address these challenges, we propose RE$^3$SIM, a 3D-photorealistic real-to-sim system that targets both gaps. RE$^3$SIM employs advanced 3D reconstruction and neural rendering techniques to faithfully recreate real-world scenarios, enabling real-time rendering of simulated cross-view cameras within a physics-based simulator. By using privileged information to collect expert demonstrations efficiently in simulation and training robot policies with imitation learning, we validate the effectiveness of the real-to-sim-to-real pipeline across various manipulation task scenarios. Notably, with only simulated data, we achieve zero-shot sim-to-real transfer with an average success rate exceeding 58%. To push the limit of real-to-sim, we further generate a large-scale simulation dataset, demonstrating how a robust policy that generalizes across various objects can be built from simulation data. Code and demos are available at: http://xshenhan.github.io/Re3Sim/.

Paper Structure

This paper contains 26 sections, 3 equations, 14 figures, 7 tables.

Figures (14)

  • Figure 1: Illustration of RE$^3$SIM. a) RE$^3$SIM allows zero-shot policy transfer on various tasks. b) The system pipeline to generate high-quality data. c) High-fidelity rendering results. d) Consistency in success rates between real and simulated environments.
  • Figure 2: Illustration of the proposed real-to-sim-to-real system, Re$^3$Sim. It leverages 3D reconstruction and a physics-based simulator, providing small 3D gaps that enable large-scale simulation data generation for learning manipulation skills via sim-to-real transfer.
  • Figure 3: Visual comparison between real and simulation. Rendering results from our hybrid rendering method compared with photos captured by real-world cameras, highlighting the high fidelity and realism achieved by our approach.
  • Figure 4: Real-world evaluation and robustness test for large-scale sim-to-real. The success rate is the proportion of trials in which all objects were successfully grasped, while the grasp rate is the proportion of objects grasped relative to the total number on the table. See qualitative results on the project page: http://xshenhan.github.io/Re3Sim/.
  • Figure 5: Data scaling effects, tested on seen objects in the real world.
  • ...and 9 more figures