Differentiable Model Predictive Control on the GPU
Emre Adabag, Marcus Greiff, John Subosits, Thomas Lew
TL;DR
This paper presents DiffMPC, a GPU-accelerated differentiable MPC solver that combines sequential quadratic programming (SQP) for the forward solve of the optimal control problem (OCP) with a preconditioned conjugate gradient (PCG) method whose tridiagonal preconditioner exploits the problem's time-sequential structure. By reusing a precomputed KKT matrix and parallelizing over both time steps and problem instances, DiffMPC achieves substantial speedups (often >4×) over existing CPU and GPU baselines on reinforcement learning and imitation learning tasks, and it scales to large-batch training. The backward pass uses the implicit function theorem to compute sensitivities of the solution with respect to problem parameters, which enables end-to-end differentiable policies and learning of cost and constraint parameters, including training under domain-randomized dynamics for driving at the limits of handling. The method is demonstrated on driving scenarios with domain randomization, where it shows improved robustness and transfers to drifting tasks on a real vehicle. The paper also discusses limitations and directions for future work, such as more thorough handling of inequality constraints and CPU-side performance improvements.
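The implicit-function-theorem step mentioned above can be written out in its generic textbook form. This is a sketch under assumed notation, not necessarily the exact formulation used in the paper: $z^\star(\theta)$ stacks the primal and dual variables at the solution, $F$ collects the KKT conditions, $\mathcal{L}$ is a downstream training loss, and $\partial F / \partial z$ is assumed invertible at the solution.

```latex
% Assumed notation (not taken from the paper): F = KKT residual,
% z^\star = primal-dual solution, \theta = problem parameters, \mathcal{L} = loss.
F\bigl(z^\star(\theta), \theta\bigr) = 0
\quad\Longrightarrow\quad
\frac{\partial z^\star}{\partial \theta}
  = -\left(\frac{\partial F}{\partial z}\right)^{-1}
     \frac{\partial F}{\partial \theta},
\qquad
\nabla_\theta \mathcal{L}
  = -\left(\frac{\partial F}{\partial \theta}\right)^{\top}
     \left(\frac{\partial F}{\partial z}\right)^{-\top}
     \nabla_{z} \mathcal{L}.
```

In this form, the backward pass requires one linear solve with the transposed KKT matrix $\partial F / \partial z$ rather than differentiation through the solver iterations, which is consistent with the remark above about reusing a precomputed KKT matrix.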
Abstract
Differentiable model predictive control (MPC) offers a powerful framework for combining learning and control. However, its adoption has been limited by the inherently sequential nature of traditional optimization algorithms, which are challenging to parallelize on modern computing hardware like GPUs. In this work, we tackle this bottleneck by introducing a GPU-accelerated differentiable optimization tool for MPC. This solver leverages sequential quadratic programming and a custom preconditioned conjugate gradient (PCG) routine with tridiagonal preconditioning to exploit the problem's structure and enable efficient parallelization. We demonstrate substantial speedups over CPU- and GPU-based baselines, significantly improving upon state-of-the-art training times on benchmark reinforcement learning and imitation learning tasks. Finally, we showcase the method on the challenging task of reinforcement learning for driving at the limits of handling, where it enables robust drifting of a Toyota Supra through water puddles.
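To make the idea of PCG with tridiagonal preconditioning concrete, the following is a minimal pure-Python sketch on a small dense toy system. It is not the authors' GPU implementation: the function names (`thomas_solve`, `pcg`, `build_demo_matrix`), the scalar pentadiagonal test matrix, and the choice of the tridiagonal part of the matrix as the preconditioner are all assumptions made for illustration.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm (no pivoting).

    lower[i] is A[i][i-1] (lower[0] unused); upper[i] is A[i][i+1]
    (upper[-1] unused). Assumes the system is diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_iter=200):
    """Preconditioned CG for a symmetric positive definite matrix A.

    The preconditioner is the tridiagonal part of A (an illustrative choice),
    applied by solving M z = r with thomas_solve.
    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    lower = [A[i][i - 1] if i > 0 else 0.0 for i in range(n)]
    diag = [A[i][i] for i in range(n)]
    upper = [A[i][i + 1] if i < n - 1 else 0.0 for i in range(n)]

    x = [0.0] * n
    r = list(b)  # residual b - A x for the zero initial guess
    z = thomas_solve(lower, diag, upper, r)
    p = list(z)
    rz = dot(r, z)
    for k in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = thomas_solve(lower, diag, upper, r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def build_demo_matrix(n):
    """Symmetric, diagonally dominant pentadiagonal toy matrix (hypothetical)."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 4.0
        if i + 1 < n:
            A[i][i + 1] = A[i + 1][i] = -1.0
        if i + 2 < n:
            A[i][i + 2] = A[i + 2][i] = -0.5
    return A


A_demo = build_demo_matrix(8)
b_demo = [1.0] * 8
x_demo, iters_demo = pcg(A_demo, b_demo)
```

In this sketch the matrix-vector product and the vector updates are independent across entries, which is the kind of work that maps well to parallel hardware; only the toy Thomas solve here is sequential. How the paper's preconditioner is constructed and applied on the GPU is not specified in this summary.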
