Data-driven optimal control of unknown nonlinear dynamical systems using the Koopman operator
Zhexuan Zeng, Ruikun Zhou, Yiming Meng, Jun Liu
TL;DR
This work tackles data-driven optimal control for unknown nonlinear systems by combining a modified Koopman operator framework with model-based reinforcement learning. It relaxes the requirements on observable functions to better capture nonlinear terms involving both states and inputs, and it uses a neural-network PDE solver to scale PDE-based value-function computation to high dimensions. The authors establish convergence guarantees for the learned value function and policies and demonstrate strong empirical performance on systems with up to 9 states and 4 inputs, achieving accumulated-cost errors between $10^{-5}$ and $10^{-3}$. The framework offers a certifiable, scalable path for identifying dynamics and synthesizing stabilizing controllers directly from data in complex, high-dimensional settings.
Abstract
Nonlinear optimal control is vital for numerous applications but remains challenging for unknown systems because of the difficulty of accurately modelling the dynamics and the computational demands involved, particularly in high-dimensional settings. This work develops a theoretically certifiable framework that integrates a modified Koopman operator approach with model-based reinforcement learning to address these challenges. By relaxing the requirements on observable functions, our method incorporates nonlinear terms involving both states and control inputs, significantly enhancing system identification accuracy. Moreover, by using neural networks to solve partial differential equations (PDEs), our approach achieves stabilizing control for high-dimensional dynamical systems with up to 9 state dimensions. The learned value function and control laws are proven to converge to those of the true system at each iteration. Additionally, the accumulated cost of the learned control closely approximates that of the true system, with errors ranging from $10^{-5}$ to $10^{-3}$.
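The abstract does not spell out the iteration it refers to. As a rough illustration only, the sketch below assumes a policy-iteration form (evaluate the value function of the current control law, then improve the control law) and specializes it to a scalar linear system $\dot{x} = a x + b u$ with running cost $q x^2 + r u^2$. In that special case the value-function PDE reduces to an algebraic equation for $V(x) = p x^2$. The system, the cost weights, and the function names `policy_iteration` and `riccati_solution` are assumptions made for illustration; this is not the authors' Koopman-based identification or their neural PDE solver.

```python
import math


def policy_iteration(a, b, q, r, k0, iters=10):
    """Policy iteration for x' = a*x + b*u with running cost q*x^2 + r*u^2.

    The control law is u = -k*x and the value function is V(x) = p*x^2.
    Returns the list of p values, one per iteration.
    (Illustrative scalar special case, not the paper's algorithm.)
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain k0 must stabilize the closed loop")
    k = k0
    history = []
    for _ in range(iters):
        # Policy evaluation: V'(x)*(a*x + b*u) + q*x^2 + r*u^2 = 0 with u = -k*x
        # gives 2*p*(a - b*k) + q + r*k^2 = 0.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: u = -(1/(2*r)) * b * V'(x) = -(b*p/r) * x.
        k = b * p / r
        history.append(p)
    return history


def riccati_solution(a, b, q, r):
    """Positive root of 2*a*p - (b^2/r)*p^2 + q = 0, i.e. the optimal p."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    # Open-loop unstable example (a > 0) with a stabilizing initial gain.
    values = policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)
    print("iterates of p:", values[:4])
    print("optimal p:   ", riccati_solution(1.0, 1.0, 1.0, 1.0))
```

Starting from any stabilizing gain, the iterates of `p` decrease towards the optimal value, which is the scalar analogue of the value function and control law converging over iterations. In the nonlinear, high-dimensional setting the abstract describes, the policy-evaluation step is a PDE rather than a one-line formula, which is where a neural-network solver would come in.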
