End-to-End and Highly-Efficient Differentiable Simulation for Robotics

Quentin Le Lidec, Louis Montaut, Yann de Mont-Marin, Fabian Schramm, Justin Carpentier

TL;DR

The paper tackles the challenge of computing accurate derivatives through contact-rich robotic simulators by introducing an end-to-end differentiable framework that unifies differentiable rigid-body dynamics, collision detection, and frictional contact resolution. It leverages implicit differentiation of the nonlinear complementarity problem (NCP) governing frictional contacts, while exploiting kinematic sparsity to achieve substantial speedups and enable gradient-based optimization in learning and control tasks. Key contributions include a reduced, mode-aware differentiation of the NCP (covering the breaking, sticking, and sliding contact modes), end-to-end gradient chaining, and collision-detection-aware derivatives, all implemented in high-performance C++ with state-of-the-art timings and practical validations on inverse problems and policy learning. The approach preserves non-relaxed physics, offering high-fidelity gradients that improve sample efficiency in policy optimization and enable precise inverse problem solving, although gradient stability requires attention in non-smooth regimes. This work paves the way for faster, more reliable model-based optimization and MPC with non-smooth contact dynamics, and suggests future directions toward smoothing strategies and extensions to soft or deformable contact models for real-world applicability.

Abstract

Over the past few years, robotics simulators have largely improved in efficiency and scalability, enabling them to generate years of simulated data in a few hours. Yet, efficiently and accurately computing the simulation derivatives remains an open challenge, with potentially high gains on the convergence speed of reinforcement learning and trajectory optimization algorithms, especially for problems involving physical contact interactions. This paper contributes to this objective by introducing a unified and efficient algorithmic solution for computing the analytical derivatives of robotic simulators. The approach considers both the collision and frictional stages, accounting for their intrinsic nonsmoothness and also exploiting the sparsity induced by the underlying multibody systems. These derivatives have been implemented in C++, and the code will be open-sourced in the Simple simulator. They achieve state-of-the-art timings ranging from 5 microseconds for a 7-dof manipulator up to 95 microseconds for a 36-dof humanoid, outperforming alternative solutions by a factor of at least 100.
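
To make the idea of mode-aware implicit differentiation mentioned in the TL;DR concrete, here is a minimal sketch on a scalar, frictionless contact problem. It is not the paper's algorithm or the Simple simulator's API: the function names, the scalar Delassus term `g`, and the free velocity `b` are illustrative assumptions. The contact impulse `lam` solves the complementarity problem 0 <= lam, g*lam + b >= 0, lam*(g*lam + b) = 0, and its derivative with respect to `b` is obtained by differentiating whichever condition is active in the current mode.

```python
def solve_contact(g, b):
    """Solve the scalar complementarity problem
    0 <= lam, sigma = g*lam + b >= 0, lam*sigma = 0, assuming g > 0.
    (Hypothetical toy model, not the paper's solver.)"""
    lam = max(0.0, -b / g)
    sigma = g * lam + b
    return lam, sigma


def dlam_db(g, b):
    """Mode-aware derivative of the impulse lam with respect to b."""
    lam, _ = solve_contact(g, b)
    if lam > 0.0:
        # Active contact: sigma = g*lam + b = 0 holds locally, so
        # g*dlam + db = 0, which gives dlam/db = -1/g.
        return -1.0 / g
    # Breaking contact: lam = 0 holds locally, so the derivative vanishes.
    # At the exact boundary b = 0 this returns the breaking-side value.
    return 0.0


def central_difference(g, b, h=1e-6):
    """Finite-difference reference, valid away from the mode boundary."""
    lam_plus, _ = solve_contact(g, b + h)
    lam_minus, _ = solve_contact(g, b - h)
    return (lam_plus - lam_minus) / (2.0 * h)


if __name__ == "__main__":
    for b in (-1.0, 1.0):
        print(b, dlam_db(2.0, b), central_difference(2.0, b))
```

Away from the mode boundary the analytical derivative and the finite-difference estimate agree. At the boundary `b = 0` the function is not differentiable, and the sketch simply picks one side.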

Paper Structure

This paper contains 26 sections, 45 equations, 6 figures, and 4 tables.

Figures (6)

  • Figure 1: Illustration of the sliding mode. $\bm \lambda^*$ lives in the boundary of the cone $\mathcal{K}_\mu$ in the direction opposite to $\bm \sigma = \bm \sigma_T$ and the variation $\text{d} \bm \lambda^*$ lies inside the tangent plane.
  • Figure 2: The robotic systems used to evaluate our approach range from simple systems such as MuJoCo's half-cheetah (Left) to more complex high-dof robots such as Unitree's Go1 (Center) and H1 (Right).
  • Figure 3: Estimation of initial conditions. A Gauss-Newton (GN) algorithm can leverage the efficient implicit differentiation to accurately retrieve the initial velocity $\bm v_0$ and impulse $\bm \tau_0$. In the third and fourth panels, the black curve representing Gradient Descent with finite differences rises due to the excessively large estimated gradients. At the boundary of a contact mode, the norm of the finite-difference gradients becomes inversely proportional to the step size used.
  • Figure 4: Contact inverse dynamics on an underactuated Go1 quadruped can be efficiently performed via a Gauss-Newton algorithm by leveraging the derivatives of our differentiable simulator.
  • Figure 5: SHAC vs. PPO on cartpole swing up (upper) and hopper (lower). The SHAC algorithm leverages differentiable simulation to achieve improved sample efficiency.
  • ...and 1 more figure
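
The remark in the Figure 3 caption about finite differences at a contact-mode boundary can be illustrated with a toy sketch. The function `outcome` below is a hypothetical stand-in for a simulated quantity, and the unit jump at `v_switch` is an assumption of this toy model, not something taken from the paper. When the finite-difference step straddles the jump, the estimate grows like 1/h as the step h shrinks.

```python
def outcome(v0, v_switch=1.0):
    """Toy stand-in for a simulated quantity that jumps by 1.0
    when v0 crosses a mode boundary at v_switch (assumed model)."""
    if v0 < v_switch:
        return 0.5 * v0
    return 0.5 * v0 + 1.0


def forward_difference(f, x, h):
    """Standard forward finite-difference estimate of df/dx."""
    return (f(x + h) - f(x)) / h


def sweep(x, steps=(1e-2, 1e-3, 1e-4)):
    """Return (h, estimate) pairs for several step sizes."""
    return [(h, forward_difference(outcome, x, h)) for h in steps]


if __name__ == "__main__":
    # Far from the boundary the estimate does not depend on h.
    print(sweep(0.2))
    # Just below the boundary every step crosses the jump,
    # so the estimate is roughly 1/h.
    print(sweep(1.0 - 1e-12))
```

In this toy setting the estimate away from the boundary stays at the true slope, while the estimate at the boundary is dominated by the jump divided by the step size, which is the behavior the caption attributes to the finite-difference baseline.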