Real-to-Sim Robot Policy Evaluation with Gaussian Splatting Simulation of Soft-Body Interactions
Kaifeng Zhang, Shuo Sha, Hanxiao Jiang, Matthew Loper, Hyunjong Song, Guangyan Cai, Zhuo Xu, Xiaochen Hu, Changxi Zheng, Yunzhu Li
TL;DR
This work presents a real-to-sim framework for evaluating robotic manipulation policies on deformable objects. It builds soft-body digital twins from real videos, using PhysTwin-based dynamics for the physics and photorealistic Gaussian Splatting for rendering. By addressing appearance and dynamics fidelity jointly, the method achieves strong sim-to-real correlation (Pearson r > 0.9) on plush toy packing, rope routing, and T-block pushing, outperforming a baseline simulator. The approach enables reproducible, scalable policy evaluation and the selection of promising checkpoints without co-training in simulation. Ablation studies show that both color alignment and physics optimization are crucial for reliably predicting real-world performance, which offers concrete guidance for future simulation-based benchmarking.
Abstract
Robotic manipulation policies are advancing rapidly, but their direct evaluation in the real world remains costly, time-consuming, and difficult to reproduce, particularly for tasks involving deformable objects. Simulation provides a scalable and systematic alternative, yet existing simulators often fail to capture the coupled visual and physical complexity of soft-body interactions. We present a real-to-sim policy evaluation framework that constructs soft-body digital twins from real-world videos and renders robots, objects, and environments with photorealistic fidelity using 3D Gaussian Splatting. We validate our approach on representative deformable manipulation tasks, including plush toy packing, rope routing, and T-block pushing, demonstrating that simulated rollouts correlate strongly with real-world execution performance and reveal key behavioral patterns of learned policies. Our results suggest that combining physics-informed reconstruction with high-quality rendering enables reproducible, scalable, and accurate evaluation of robotic manipulation policies. Website: https://real2sim-eval.github.io/
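The sim-to-real agreement described here is a correlation between per-policy success rates measured in simulation and in the real world. As a minimal sketch of how such a Pearson correlation could be computed, the Python snippet below uses a hypothetical helper `pearson_r` and made-up success rates for five policy checkpoints. The numbers are illustrative assumptions only and are not results from the paper, and the authors' evaluation code may differ.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical success rates for five policy checkpoints
# (illustrative values only, NOT taken from the paper).
sim_success = [0.10, 0.35, 0.50, 0.70, 0.85]
real_success = [0.15, 0.30, 0.55, 0.65, 0.90]

r = pearson_r(sim_success, real_success)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would indicate that checkpoints ranking higher in simulation also tend to rank higher on the real robot, which is the property that makes a simulator usable for checkpoint selection.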
