Agile Robotics: Optimal Control, Reinforcement Learning, and Differentiable Simulation
Yunlong Song, Davide Scaramuzza
TL;DR
The paper compares continuous-time optimal control (OC), model predictive control (MPC), reinforcement learning (RL), and differentiable simulation as pathways to agile robot control. It formalizes continuous-time optimal control as $\min_{x(\cdot), u(\cdot)} \int_{0}^{T} \ell(x(t),u(t),t)\,dt + \ell(x(T))$, MPC with the discrete objective $J(x,u) = \sum_{k=0}^{N-1} \ell(x_k,u_k) + \ell(x_N)$, and policy learning with $J(\theta) = \mathbb{E}_{\tau \sim \pi_{\theta}} [\sum_{k} r_k]$. These formulations have distinct optimization targets: OC and MPC optimize a cost over state and control trajectories, whereas policy learning optimizes the expected sum of rewards over the policy parameters $\theta$. In challenging drone-racing scenarios, RL outperforms OC, and the results attribute RL's robustness to optimizing task-level rewards and to domain randomization. The paper also presents a policy-search-for-MPC approach that learns high-level decision variables offline for use in real-time MPC, and a differentiable-simulation framework for rapid, gradient-based learning of legged locomotion with zero-shot real-world transfer. Future work includes integrating structured dynamics and OC constraints into RL to reduce sample complexity, and extending the approach to vision-based humanoid locomotion.
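To make the difference between the discrete MPC objective $J(x,u)$ and the policy objective $J(\theta)$ concrete, here is a minimal Python sketch. It is not the authors' implementation. The 1-D double-integrator dynamics, the quadratic costs, the linear policy, and every function name are assumptions chosen only for illustration, and the reward is assumed to be the negative stage cost.

```python
import numpy as np

# Hypothetical toy system (not from the paper): 1-D double integrator,
# state x = [position, velocity], control u = acceleration.
DT = 0.1
A = np.array([[1.0, DT], [0.0, 1.0]])
B = np.array([0.0, DT])


def step(x, u):
    """One discrete-time step x_{k+1} = A x_k + B u_k."""
    return A @ x + B * u


def stage_cost(x, u):
    """Assumed quadratic stage cost l(x_k, u_k)."""
    return float(x @ x + 0.1 * u * u)


def terminal_cost(x):
    """Assumed quadratic terminal cost l(x_N)."""
    return float(10.0 * (x @ x))


def mpc_objective(x0, controls):
    """J(x, u) = sum_{k=0}^{N-1} l(x_k, u_k) + l(x_N) along one rollout."""
    x = np.asarray(x0, dtype=float)
    total = 0.0
    for u in controls:
        total += stage_cost(x, u)
        x = step(x, u)
    return total + terminal_cost(x)


def policy_return(theta, x0, horizon, rng, noise_std=0.05):
    """Sum of rewards r_k = -l(x_k, u_k) for one rollout of a noisy linear policy."""
    theta = np.asarray(theta, dtype=float)
    x = np.asarray(x0, dtype=float)
    total = 0.0
    for _ in range(horizon):
        # Stochastic policy: u = theta^T x + Gaussian exploration noise.
        u = float(theta @ x) + noise_std * rng.standard_normal()
        total += -stage_cost(x, u)
        x = step(x, u)
    return total


def estimate_policy_objective(theta, x0, horizon, n_rollouts=64, seed=0):
    """Monte Carlo estimate of J(theta) = E_{tau ~ pi_theta}[sum_k r_k]."""
    rng = np.random.default_rng(seed)
    returns = [policy_return(theta, x0, horizon, rng) for _ in range(n_rollouts)]
    return float(np.mean(returns))
```

In this sketch, `mpc_objective` scores one explicit control sequence, which is the quantity an MPC solver would minimize at each replanning step. `estimate_policy_objective` instead averages returns over sampled rollouts, so it scores a parameter vector `theta` rather than a single trajectory. For example, `estimate_policy_objective(np.array([-1.0, -1.5]), [1.0, 0.0], horizon=50)` evaluates a stabilizing feedback gain under these assumed dynamics.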
Abstract
Control systems are at the core of every real-world robot. They are deployed in an ever-increasing number of applications, ranging from autonomous racing and search-and-rescue missions to industrial inspections and space exploration. To achieve peak performance, certain tasks require pushing the robot to its maximum agility. How can we design control algorithms that enhance the agility of autonomous robots and maintain robustness against unforeseen disturbances? This paper addresses this question by leveraging fundamental principles in optimal control, reinforcement learning, and differentiable simulation.
