DiffSim2Real: Deploying Quadrupedal Locomotion Policies Purely Trained in Differentiable Simulation

Joshua Bagajo, Clemens Schwarke, Victor Klemm, Ignat Georgiev, Jean-Pierre Sleiman, Jesus Tordesillas, Animesh Garg, Marco Hutter

TL;DR

This work demonstrates that locomotion policies trained with analytic gradients from a differentiable simulator can be successfully transferred to the real world, marking the first time a real quadrupedal robot is able to locomote after training exclusively in a differentiable simulation.

Abstract

Differentiable simulators provide analytic gradients, enabling more sample-efficient learning algorithms and paving the way for data-intensive learning tasks such as learning from images. In this work, we demonstrate that locomotion policies trained with analytic gradients from a differentiable simulator can be successfully transferred to the real world. Typically, simulators that offer informative gradients lack the physical accuracy needed for sim-to-real transfer, and vice versa. A key factor in our success is a smooth contact model that combines informative gradients with physical accuracy, ensuring effective transfer of learned behaviors. To the best of our knowledge, this is the first time a real quadrupedal robot is able to locomote after training exclusively in a differentiable simulation.

Paper Structure

This paper contains 4 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: A quadrupedal robot learning to walk on flat terrain in a differentiable simulation. Policies trained with a hard contact model follow unreasonable foothold patterns and do not learn to locomote robustly. Training with a soft contact model results in stable locomotive gaits but the learned behaviors do not transfer to real hardware. Policies trained with an analytically smooth contact model exhibit effective and stable locomotive gaits and transfer to the real world. Video: https://youtu.be/2wZmmUyqUQM.
  • Figure 2: The normal contact force (left) and its gradient (right) with respect to the penetration depth $d$ between two contacting bodies. Hard contact (blue) exhibits a discontinuity at $d=0$, and its analytical gradient is zero almost everywhere. Soft contact (orange) is continuous, but because its normal force is unbounded it cannot model stiff contact accurately without becoming unstable. Stochastically smoothing hard contact (cyan) removes the discontinuity, but the first-order gradient (FoG) remains zero and is thus uninformative. Analytically smoothing hard contact (green) has a similar effect on the dynamics as stochastic smoothing, with the advantage of an informative FoG.
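
The qualitative behavior described in Figure 2 can be reproduced with a one-dimensional toy example. The sketch below is only an illustration under stated assumptions, not the contact model used in the paper: hard contact is modeled as a step in the penetration depth, soft contact as a linear penalty spring, and smoothing as a Gaussian blur of the step. The constants `F_MAX`, `K`, and `SIGMA` and all function names are hypothetical.

```python
import math
import random

F_MAX = 1.0   # hypothetical force magnitude once the bodies touch
K = 5.0       # hypothetical spring stiffness of the soft model
SIGMA = 0.1   # hypothetical smoothing width (std of the Gaussian)


def hard_force(d):
    """Hard contact as a step: no force for d <= 0, F_MAX for d > 0."""
    return F_MAX if d > 0.0 else 0.0


def hard_grad(d):
    """Pointwise derivative of the step, which is zero almost everywhere."""
    return 0.0


def soft_force(d):
    """Penalty spring: continuous, but grows without bound with depth."""
    return K * max(d, 0.0)


def soft_grad(d):
    """Derivative of the penalty spring."""
    return K if d > 0.0 else 0.0


def smooth_force(d):
    """Step blurred by a Gaussian, in closed form via the normal CDF."""
    return F_MAX * 0.5 * (1.0 + math.erf(d / (SIGMA * math.sqrt(2.0))))


def smooth_grad(d):
    """Derivative of smooth_force: a scaled Gaussian density, nonzero near d = 0."""
    z = d / SIGMA
    return F_MAX * math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))


def stochastic_force_and_fog(d, n=20000, seed=0):
    """Monte Carlo smoothing: average the hard model over noisy depths.

    The first-order gradient (FoG) averages the per-sample derivatives,
    which are all zero, so it carries no information.
    """
    rng = random.Random(seed)
    samples = [d + rng.gauss(0.0, SIGMA) for _ in range(n)]
    force = sum(hard_force(s) for s in samples) / n
    fog = sum(hard_grad(s) for s in samples) / n
    return force, fog


if __name__ == "__main__":
    for d in (-0.2, 0.0, 0.2):
        mc_force, mc_fog = stochastic_force_and_fog(d)
        print(f"d={d:+.1f}  hard={hard_force(d):.2f}  soft={soft_force(d):.2f}  "
              f"mc={mc_force:.2f} (FoG {mc_fog:.2f})  "
              f"smooth={smooth_force(d):.2f} (grad {smooth_grad(d):.2f})")
```

In this toy setting the Monte Carlo average and the closed-form blur agree on the force, but only the closed-form version yields a nonzero derivative near $d=0$, which mirrors the distinction the caption draws between stochastic and analytic smoothing.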