An Real-Sim-Real (RSR) Loop Framework for Generalizable Robotic Policy Transfer with Differentiable Simulation

Lu Shi, Yuxuan Xu, Shiyu Wang, Jinhao Huang, Wenhao Zhao, Yufei Jia, Zike Yan, Weibin Gu, Guyue Zhou

TL;DR

This work tackles the persistent sim-to-real gap in robotic policy transfer by introducing a Real-Sim-Real (RSR) loop that jointly tunes a differentiable simulator and retrains policies. A key contribution is the adaptive InfoGap loss, which combines a task objective with information-theoretic terms to bias data collection toward informative real-world samples and to reduce dataset bias across iterations. Implemented on MuJoCo MJX and evaluated on 6-DOF robotic manipulation tasks, the approach substantially lowers the divergence between simulated and real dynamics (as measured by distributional metrics) and yields better real-world performance and generalization. The framework offers a general, data-efficient pathway for transferring policies from simulation to real robots and can be extended to aerial robotics and other dynamic environments.

Abstract

The sim-to-real gap remains a critical challenge in robotics, hindering the deployment of algorithms trained in simulation to real-world systems. This paper introduces a novel Real-Sim-Real (RSR) loop framework leveraging differentiable simulation to address this gap by iteratively refining simulation parameters, aligning them with real-world conditions, and enabling robust and efficient policy transfer. A key contribution of our work is the design of an informative cost function that encourages the collection of diverse and representative real-world data, minimizing bias and maximizing the utility of each data point for simulation refinement. This cost function integrates seamlessly into existing reinforcement learning algorithms (e.g., PPO, SAC) and ensures a balanced exploration of critical regions in the real domain. Furthermore, our approach is implemented on the versatile MuJoCo MJX platform, and our framework is compatible with a wide range of robotic systems. Experimental results on several robotic manipulation tasks demonstrate that our method significantly reduces the sim-to-real gap, achieving high task performance and generalizability across diverse scenarios of both explicit and implicit environmental uncertainties.
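The alternation described above (roll out on the real system, refine simulator parameters from that data, retrain the policy in the refined simulator) can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the "robot" is a 1-D damped system `v' = u - c*v`, the only simulator parameter is the damping `c`, the simulator's gradient `dv/dc` is propagated by hand rather than by MJX autodiff, the parameter update is a Gauss-Newton step, and "policy training" is a closed-form stand-in for RL. The InfoGap loss is not modeled. All function names are hypothetical.

```python
def rollout(c, u, steps=100, dt=0.05):
    """Explicit-Euler rollout of v' = u - c*v from rest.

    Returns the velocity trajectory and its sensitivity dv/dc,
    which is what a differentiable simulator would provide.
    """
    v, s = 0.0, 0.0
    vs, sens = [], []
    for _ in range(steps):
        # Both right-hand sides use the old (v, s) values.
        v, s = v + dt * (u - c * v), s + dt * (-v - c * s)
        vs.append(v)
        sens.append(s)
    return vs, sens


def tune_simulator(c_sim, u, v_real, inner_steps=3):
    """Inner loop: fit the simulated damping to a real trajectory.

    Assumption: a 1-D Gauss-Newton step on the squared trajectory
    error, using the simulator sensitivities as the Jacobian.
    """
    for _ in range(inner_steps):
        v_sim, sens = rollout(c_sim, u)
        num = sum((a - b) * s for a, b, s in zip(v_sim, v_real, sens))
        den = sum(s * s for s in sens)
        c_sim -= num / den
    return c_sim


def train_policy(c_sim, v_target):
    """Stand-in for RL: constant force whose simulated steady state is v_target."""
    return c_sim * v_target


def rsr_loop(c_true=2.0, c_sim=0.5, v_target=1.0, outer_iters=5):
    """Outer loop: policy in sim -> real rollout -> simulator refinement."""
    errors = []
    for _ in range(outer_iters):
        u = train_policy(c_sim, v_target)
        v_real, _ = rollout(c_true, u)  # plays the role of the real robot
        errors.append(abs(v_real[-1] - v_target))
        c_sim = tune_simulator(c_sim, u, v_real)
    return c_sim, errors


if __name__ == "__main__":
    c_final, errs = rsr_loop()
    print("estimated damping:", c_final)
    print("real-world tracking error per iteration:", errs)
```

In this toy setting the policy trained in the mis-specified simulator undershoots the target on the "real" system, and the tracking error shrinks as the simulated damping approaches the true value over successive iterations.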

Paper Structure

This paper contains 22 sections, 6 equations, 6 figures, 1 table, 1 algorithm.

Figures (6)

  • Figure 1: Overview of the proposed RSR (Real-to-Sim-to-Real) Loop Framework (marked in red), consisting of two key feedback loops. The "Sim-Env Parameters Tuning Loop" (marked in green) adjusts the simulator parameters using data from the real robot. This loop iterates (indexed by $i$) to fine-tune the simulator, reducing the sim-to-real gap. The "Policy Training Loop" (marked in blue) uses the tuned simulator of the current iteration $k$ and the adaptive InfoGap loss to further train a policy for the next iteration. Together, these loops facilitate continuous improvement of both the policy and the simulator to enhance real-world performance.
  • Figure 2: The process of bridging the sim-to-real gap in robot training. When the discrepancy between the simulation (blue domain) and the real robot (orange domain) is large, the policy prioritizes collecting informative data (marked as crosses) from regions of the real domain beyond the task trajectory (black dashed line), so as to better characterize the real domain's properties.
  • Figure 3: Real-world pushing trajectories across different iterations.
  • Figure 4: Mean trajectory error over three trials in the X and Y directions against time for different iterations.
  • Figure 5: 1-$\sigma$ bounds of real trajectories for the yaw angle in the T-shaped block pushing trials for different iterations, where the shaded area represents the bounds.
  • ...and 1 more figure