Robo-GS: A Physics Consistent Spatial-Temporal Model for Robotic Arm with Hybrid Representation
Haozhe Lou, Yurong Liu, Yike Pan, Yiran Geng, Jianteng Chen, Wenlong Ma, Chenglong Li, Lin Wang, Hengzhen Feng, Lu Shi, Liyi Luo, Yongliang Shi
TL;DR
The paper tackles the Real2Sim2Real gap in robotic arm manipulation by introducing a hybrid representation that fuses mesh geometry, Gaussian primitives, and physics attributes through a Gaussian-Mesh-Pixel binding. This binding enables a differentiable pipeline in which real video, simulation, and rendering share a common spatiotemporal representation, supported by URDF-based kinematics and Newton-Euler dynamics. Key contributions include a unified asset representation, mesh extraction and alignment techniques, physics-aware forward-kinematics and dynamics equations, and a proposed dataset for end-to-end policy training. Experimental results demonstrate improved mesh quality, high-fidelity rendering, and manipulable models that support Sim2Real transfer and novel-policy editing, with potential to improve real-world robotic control and learning. The approach advances the state of the art in physics-consistent digital twins and enables more reliable policy transfer and vision-based manipulation.
Abstract
Real2Sim2Real plays a critical role in robotic arm control and reinforcement learning, yet bridging the gap between the real world and simulation remains a significant challenge due to the complex physical properties of robots and the objects they manipulate. Existing methods lack a comprehensive solution for accurately reconstructing real-world objects with both spatial representations and their associated physics attributes. We propose a Real2Sim pipeline with a hybrid representation model that integrates mesh geometry, 3D Gaussian kernels, and physics attributes to enhance the digital asset representation of robotic arms. This hybrid representation is implemented through a Gaussian-Mesh-Pixel binding technique, which establishes an isomorphic mapping between mesh vertices and Gaussian models. This mapping enables a fully differentiable rendering pipeline that can be optimized through numerical solvers, achieves high-fidelity rendering via Gaussian Splatting, and facilitates physically plausible simulation of the robotic arm's interaction with its environment using mesh-based methods. The code, full presentation, and datasets will be made publicly available at our website: https://robostudioapp.com
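To make the idea of binding Gaussians to mesh vertices concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a planar chain of revolute joints, a one-to-one pairing in which each Gaussian mean coincides with a mesh vertex stored in its link's local frame, and rigid links, so that each Gaussian follows its link's pose: the mean is transformed as R m + t and the covariance as R Σ Rᵀ. All function names and the `bindings` layout are hypothetical.

```python
import math

def rot_z(theta):
    """3x3 rotation about the z axis, as nested lists."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def link_poses(joint_angles, link_lengths):
    """Forward kinematics for a planar revolute chain (assumed geometry).

    Returns one (R, t) world pose per link; each link extends along its
    local x axis and the next joint sits at the end of the link.
    """
    poses = []
    R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    t = [0.0, 0.0, 0.0]
    for theta, length in zip(joint_angles, link_lengths):
        R = matmul(R, rot_z(theta))          # accumulate joint rotation
        poses.append((R, list(t)))           # pose of this link's frame
        offset = matvec(R, [length, 0.0, 0.0])
        t = [t[i] + offset[i] for i in range(3)]  # origin of the next link
    return poses

def pose_bound_gaussians(bindings, poses):
    """Move each vertex-bound Gaussian with the pose of its link.

    Each binding holds the link index, the mesh vertex (Gaussian mean) in
    the link frame, and the Gaussian covariance in the link frame.
    """
    posed = []
    for b in bindings:
        R, t = poses[b["link"]]
        mean = [x + y for x, y in zip(matvec(R, b["mean"]), t)]
        cov = matmul(matmul(R, b["cov"]), transpose(R))
        posed.append((mean, cov))
    return posed

# Two unit-length links; the first joint is rotated by 90 degrees.
poses = link_poses([math.pi / 2, 0.0], [1.0, 1.0])
bindings = [{
    "link": 1,
    "mean": [1.0, 0.0, 0.0],  # vertex at the tip of the second link
    "cov": [[4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
}]
posed = pose_bound_gaussians(bindings, poses)
```

In this example the bound Gaussian ends up at the arm tip, approximately (0, 2, 0), and its elongated axis rotates from x to y along with the link. Because the mapping from joint angles to Gaussian parameters is composed only of smooth operations, the same structure could be written in an autodiff framework to obtain gradients with respect to the joint angles.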
