A Differentiable Physics Engine for Deep Learning in Robotics

Jonas Degrave, Michiel Hermans, Joni Dambre, Francis wyffels

TL;DR

This work addresses the bottleneck of optimizing robotic controllers with non-differentiable physics by introducing a differentiable 3D rigid-body engine implemented in Theano. By enabling analytic gradients and backpropagation through time, the authors demonstrate gradient-based optimization of controllers, including networks with millions of parameters, and show substantial speedups over derivative-free methods. They validate the approach across tasks such as ball throwing, a 4-DOF robot arm, quadruped gait, and a vision-based pendulum, illustrating scalability, GPU batch efficiency, and end-to-end differentiability through perception. The results suggest a practical path for integrating deep learning with physics-driven robotics, offering an alternative to deep Q-learning and enabling faster hardware-software co-design and learning from complex sensors like cameras.

Abstract

An important field in robotics is the optimization of controllers. Currently, robots are often treated as a black box in this optimization process, which is why derivative-free optimization methods such as evolutionary algorithms or reinforcement learning are omnipresent. When gradient-based methods are used, models are kept small or rely on finite-difference approximations of the Jacobian. Finite differencing quickly grows expensive as the number of parameters increases, as is the case in deep learning. We propose an implementation of a modern physics engine that can be differentiated with respect to the control parameters. This engine is implemented for both CPU and GPU. First, this paper shows how such an engine speeds up the optimization process, even for small problems. Furthermore, it explains why this is an alternative to deep Q-learning for using deep learning in robotics. Finally, we argue that this is a big step for deep learning in robotics, as it opens up new possibilities to optimize robots, in both hardware and software.
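As a rough illustration of why finite differences scale poorly, the sketch below counts loss evaluations for a toy quadratic loss. The loss, the parameter count of 1000, and all function names are assumptions made for this example and do not come from the paper; in the setting the abstract describes, each loss evaluation would correspond to a full physics rollout.

```python
def make_counted_loss():
    """Return a toy loss and a counter of how often it was evaluated.

    Hypothetical stand-in: in practice one call would be one full simulation.
    """
    calls = {"n": 0}

    def loss(params):
        calls["n"] += 1
        return sum((p - 1.0) ** 2 for p in params)

    return loss, calls


def finite_difference_grad(loss, params, eps=1e-6):
    """Forward differences: one baseline call plus one call per parameter."""
    base = loss(params)
    grad = []
    for i in range(len(params)):
        shifted = list(params)
        shifted[i] += eps
        grad.append((loss(shifted) - base) / eps)
    return grad


def analytic_grad(params):
    """Closed-form gradient of the toy loss: d/dp (p - 1)^2 = 2 (p - 1)."""
    return [2.0 * (p - 1.0) for p in params]


params = [0.0] * 1000
loss, calls = make_counted_loss()
fd = finite_difference_grad(loss, params)
exact = analytic_grad(params)
max_err = max(abs(a - b) for a, b in zip(fd, exact))
print("loss evaluations for finite differences:", calls["n"])
print("largest deviation from the analytic gradient:", max_err)
```

The finite-difference estimate needs one loss evaluation per parameter plus a baseline, whereas a differentiable model can provide the whole gradient from a single forward and backward pass.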

Paper Structure

This paper contains 13 sections, 3 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Illustration of how a closed-loop neural network controller would be used to actuate a robot. The neural network receives sensor signals from the sensors on the robot and uses these to generate motor signals, which are sent to the servo motors. The neural network can also generate a signal that it can use at the next timestep to control the robot.
  • Figure 2: Illustration of the dynamic system with the robot and controller, after unrolling over time. The neural networks $g_{\textrm{deep}}$ and $h_{\textrm{deep}}$ with weights $\mathbf{W}$ receive sensor signals $\mathbf{s}^{t}$ from the sensors on the robot and use these to generate motor signals $\mathbf{u}^{t}$, which are used by the physics engine $f_{\textrm{ph}}$ to find the next state of the robot in the physical system. These neural networks also have a memory, implemented with recurrent connections $\mathbf{h}^{t}$. From the state $\mathbf{x}^t$ of these robots, the loss $\mathcal{L}$ can be found. In order to find $d \mathcal{L} /d\mathbf{W}$, every block in this chart needs to be differentiable. The contribution of this paper is to implement a differentiable $f_{\textrm{ph}}$, which allows us to optimize $\mathbf{W}$ to minimize $\mathcal{L}$ more efficiently than was possible before.
  • Figure 3: (A) Illustration of the ball model used in the first task. (B) Illustration of the quadruped robot model with 8 actuated degrees of freedom: one in each shoulder and one in each elbow. The spine of the robot can collide with the ground through 4 spheres inside the cuboid. (C) Illustration of the robot arm model with 4 actuated degrees of freedom.
  • Figure 4: A frame captured by the differentiable camera looking at the model of the pendulum-cart system. The resolution used is 288 by 96 pixels. All the textures are made from pictures of the actual system.
  • Figure 5: The camera model used to convert the three-dimensional point $P$ into a two-dimensional pixel $(u,v)$ on the projection plane.
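The unrolled system in Figure 2 can be sketched in a few lines of plain Python. This is a minimal sketch under strong assumptions, not the engine described in the paper: a linear controller stands in for $g_{\textrm{deep}}$, a one-dimensional point mass integrated with semi-implicit Euler stands in for $f_{\textrm{ph}}$, the recurrent memory $h_{\textrm{deep}}$ is omitted, and the reverse pass is written by hand rather than derived automatically. All names and constants (`DT`, `MASS`, `STEPS`, `TARGET`, the learning rate) are hypothetical.

```python
DT, MASS, STEPS, TARGET = 0.05, 1.0, 40, 1.0  # hypothetical constants


def controller(w, pos, vel):
    """Stand-in for g_deep: a linear map from sensor values to one motor force."""
    return w[0] * pos + w[1] * vel + w[2]


def physics_step(pos, vel, u):
    """Stand-in for f_ph: semi-implicit Euler step for a point mass on a line."""
    vel_next = vel + DT * u / MASS
    pos_next = pos + DT * vel_next
    return pos_next, vel_next


def rollout(w):
    """Unroll controller and physics over time, storing states for the reverse pass."""
    pos, vel = 0.0, 0.0
    states = []
    for _ in range(STEPS):
        states.append((pos, vel))
        u = controller(w, pos, vel)
        pos, vel = physics_step(pos, vel, u)
    loss = (pos - TARGET) ** 2
    return loss, states, pos


def loss_and_grad(w):
    """Backpropagation through time: reverse pass through every block of the rollout."""
    loss, states, final_pos = rollout(w)
    d_pos, d_vel = 2.0 * (final_pos - TARGET), 0.0  # dL/d(final state)
    grad = [0.0, 0.0, 0.0]
    for pos, vel in reversed(states):
        # Reverse of physics_step.
        d_vel_next = d_vel + DT * d_pos
        d_u = d_vel_next * DT / MASS
        # Reverse of controller: accumulate dL/dW and pass gradients to the state.
        grad[0] += d_u * pos
        grad[1] += d_u * vel
        grad[2] += d_u
        d_pos = d_pos + d_u * w[0]
        d_vel = d_vel_next + d_u * w[1]
    return loss, grad


# Plain gradient descent on the controller weights.
w = [0.0, 0.0, 0.0]
initial_loss, _ = loss_and_grad(w)
for _ in range(100):
    _, grad = loss_and_grad(w)
    w = [wi - 0.02 * gi for wi, gi in zip(w, grad)]
final_loss, _ = loss_and_grad(w)
print("loss before:", initial_loss, "loss after:", final_loss)
```

Because every block in the chain has a derivative, one forward rollout and one reverse pass give the gradient with respect to all weights at once, whatever the number of weights. In a framework with automatic differentiation the reverse pass would not have to be written by hand.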