CusADi: A GPU Parallelization Framework for Symbolic Expressions and Optimal Control
Se Hwan Jeon, Seungwoo Hong, Ho Jae Lee, Charles Khazoom, Sangbae Kim
TL;DR
The paper tackles the bottleneck of integrating model-based optimization into large-scale reinforcement learning by enabling GPU-based parallelization of symbolic expressions. It introduces CusADi, an extension of the CasADi framework that auto-generates CUDA kernels to evaluate arbitrary symbolic expressions in parallel across thousands of environments, and formulates a closed-form, fixed-iteration approximation to the optimal control problem to enable scalable MPC. The authors provide code generation and a PyTorch interface, demonstrate speedups up to 1000x over serial CPU and substantial gains when data remains on-device, and validate the approach through MIT Humanoid MPC, centroidal momentum augmentation in RL, and parallelized quadcopter rollouts. The work offers a practical pathway to integrate model-based optimization into RL pipelines, enabling rapid parallel simulations, parameter sweeps, and policy training with large batch sizes.
Abstract
The parallelism afforded by GPUs presents significant advantages in training controllers through reinforcement learning (RL). However, integrating model-based optimization into this process remains challenging due to the complexity of formulating and solving optimization problems across thousands of instances. In this work, we present CusADi, an extension of the CasADi symbolic framework to support the parallelization of arbitrary closed-form expressions on GPUs with CUDA. We also formulate a closed-form approximation for solving general optimal control problems, enabling large-scale parallelization and evaluation of MPC controllers. Our results show a ten-fold speedup relative to a similar MPC implementation on the CPU, and we demonstrate the use of CusADi for various applications, including parallel simulation, parameter sweeps, and policy training.
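To make the idea of parallelizing a closed-form expression concrete, the following is a minimal sketch in Python of one way such code generation could work. It assumes a symbolic expression has already been flattened into a list of elementary operations on a work vector, and it emits CUDA source in which each thread evaluates the expression for one environment. The instruction format, the function names `emit_cuda` and `eval_batch`, and the environment-major memory layout are all hypothetical choices made for illustration; they are not taken from CusADi or CasADi.

```python
# Toy illustration only: the instruction format and all names here are
# hypothetical and are not the CusADi or CasADi API.
import math

# f(x0, x1) = sin(x0) * x1 + x0, flattened into elementary operations.
# Each tuple is (operation, destination index, source indices...).
INSTRUCTIONS = [
    ("input", 0, 0),    # w[0] = x[0]
    ("input", 1, 1),    # w[1] = x[1]
    ("sin", 2, 0),      # w[2] = sin(w[0])
    ("mul", 3, 2, 1),   # w[3] = w[2] * w[1]
    ("add", 4, 3, 0),   # w[4] = w[3] + w[0]
    ("output", 0, 4),   # y[0] = w[4]
]
N_IN, N_OUT, N_WORK = 2, 1, 5


def emit_cuda(instructions, name="eval_expr"):
    """Return CUDA source where one thread evaluates one environment."""
    lines = [
        'extern "C" __global__ void ' + name
        + "(const double* x, double* y, int n_envs) {",
        "    int env = blockIdx.x * blockDim.x + threadIdx.x;",
        "    if (env >= n_envs) return;",
        f"    double w[{N_WORK}];",
    ]
    for op, dst, *src in instructions:
        if op == "input":
            # Assumed layout: inputs stored contiguously per environment.
            stmt = f"w[{dst}] = x[env * {N_IN} + {src[0]}];"
        elif op == "output":
            stmt = f"y[env * {N_OUT} + {dst}] = w[{src[0]}];"
        elif op == "sin":
            stmt = f"w[{dst}] = sin(w[{src[0]}]);"
        else:
            symbol = {"add": "+", "mul": "*"}[op]
            stmt = f"w[{dst}] = w[{src[0]}] {symbol} w[{src[1]}];"
        lines.append("    " + stmt)
    lines.append("}")
    return "\n".join(lines)


def eval_batch(instructions, batch):
    """Serial CPU reference: evaluate the same instructions per environment."""
    results = []
    for x in batch:
        w = [0.0] * N_WORK
        y = [0.0] * N_OUT
        for op, dst, *src in instructions:
            if op == "input":
                w[dst] = x[src[0]]
            elif op == "output":
                y[dst] = w[src[0]]
            elif op == "sin":
                w[dst] = math.sin(w[src[0]])
            elif op == "mul":
                w[dst] = w[src[0]] * w[src[1]]
            elif op == "add":
                w[dst] = w[src[0]] + w[src[1]]
        results.append(y)
    return results


if __name__ == "__main__":
    print(emit_cuda(INSTRUCTIONS))
    print(eval_batch(INSTRUCTIONS, [[0.0, 3.0], [math.pi / 2, 2.0]]))
```

The serial loop in `eval_batch` is the CPU analogue of what the generated kernel does with one thread per environment. Because every environment runs the same straight-line sequence of operations with no data-dependent branching, this kind of workload maps naturally onto GPU threads.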
