Linear quadratic control of nonlinear systems with Koopman operator learning and the Nyström method

Edoardo Caldarelli; Antoine Chatalic; Adrià Colomé; Cesare Molinari; Carlos Ocampo-Martinez; Carme Torras; Lorenzo Rosasco

TL;DR

The paper addresses controlling nonlinear dynamical systems by combining Koopman operator learning with kernel methods in an RKHS and using Nyström approximations to enable scalable, data-driven LQR control. It derives finite-sample bounds linking Nyström approximation errors to both the Riccati operator and the LQR objective, showing convergence rates of $m^{-1/2}$ for the Riccati operator and $m^{-1}$ for the LQR cost. The authors develop a full pipeline: lift the state into an RKHS, learn a linear predictor affine in the input, apply Nyström to obtain a finite-dimensional surrogate, and solve a standard LQR in that surrogate space, with state reconstruction to the original coordinates. Numerical experiments on the Duffing oscillator and cloth manipulation corroborate the theory, demonstrating competitive control performance and improved scalability compared to kernel-based baselines.
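The pipeline described above (lift, fit a linear model, compress with Nyström, solve LQR, reconstruct the state) can be sketched in a few NumPy functions. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, uniform landmark sampling, ridge regularisation values, fixed-point Riccati iteration, toy dynamics, and all function names (`rbf_kernel`, `nystrom_features`, `fit_lifted_model`, `lqr_gain`, `run_toy_example`) are hypothetical choices made for the example.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    # Gaussian kernel matrix between the rows of X and the rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * lengthscale ** 2))

def nystrom_features(X, landmarks, lengthscale=1.0, jitter=1e-8):
    # m-dimensional features z(x) = (K_mm + jitter I)^{-1/2} k_m(x),
    # so that z(x)^T z(y) approximates k(x, y).
    K_mm = rbf_kernel(landmarks, landmarks, lengthscale)
    w, V = np.linalg.eigh(K_mm)
    w = np.maximum(w, 0.0) + jitter  # guard against tiny negative eigenvalues
    K_inv_sqrt = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    return rbf_kernel(X, landmarks, lengthscale) @ K_inv_sqrt

def fit_lifted_model(Z, U, Z_next, X, reg=1e-3):
    # Ridge regression for z_next ~ A z + B u (linear in the lifted state,
    # affine in the input) and for the reconstruction x ~ C z.
    m = Z.shape[1]
    W = np.hstack([Z, U])
    AB = np.linalg.solve(W.T @ W + reg * np.eye(W.shape[1]), W.T @ Z_next).T
    C = np.linalg.solve(Z.T @ Z + reg * np.eye(m), Z.T @ X).T
    return AB[:, :m], AB[:, m:], C

def lqr_gain(A, B, Q, R, iters=200):
    # Fixed-point iteration on the discrete-time Riccati equation.
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K, P

def run_toy_example(seed=0, n=400, m=15):
    # Toy 1-D nonlinear system (an assumption, not taken from the paper).
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, 1))
    U = rng.uniform(-1.0, 1.0, size=(n, 1))
    X_next = 0.9 * X - 0.1 * X ** 3 + 0.1 * U
    landmarks = X[rng.choice(n, size=m, replace=False)]  # uniform Nystrom sampling
    Z = nystrom_features(X, landmarks)
    Z_next = nystrom_features(X_next, landmarks)
    A, B, C = fit_lifted_model(Z, U, Z_next, X)
    # Penalise the reconstructed state C z, so the state cost is Q = C^T C.
    K, _ = lqr_gain(A, B, Q=C.T @ C, R=np.eye(1))
    return A, B, C, K
```

In this sketch the control applied to the original system would be `u = -K @ z(x)`, with `z(x)` the Nyström features of the current state. The surrogate has dimension `m` rather than the number of training samples, which is where the computational saving comes from.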

Abstract

In this paper, we study how the Koopman operator framework can be combined with kernel methods to effectively control nonlinear dynamical systems. While kernel methods typically have large computational requirements, we show how random subspaces (Nyström approximation) can be used to achieve huge computational savings while preserving accuracy. Our main technical contribution is deriving theoretical guarantees on the effect of the Nyström approximation. More precisely, we study the linear quadratic regulator problem, showing that the approximated Riccati operator converges at the rate $m^{-1/2}$, and the regulator objective, for the associated solution of the optimal control problem, converges at the rate $m^{-1}$, where $m$ is the random subspace size. Theoretical findings are complemented by numerical experiments corroborating our results.

Paper Structure (26 sections, 19 theorems, 137 equations, 7 figures, 2 tables)

Introduction
Background and notation
Koopman system identification
Choosing the Koopman lifting function
Regression problem and corresponding solutions
Nyström approximation
Kernels and Koopman LQR
Theoretical analysis
Hypotheses
Accuracy of the Nyström approximation of the transition operator
Convergence analysis for the Riccati operator
Convergence analysis of the LQR objective function
Simulation results
Proof-of-concept dynamics
Duffing oscillator
...and 11 more sections

Key Result

Theorem 5

Under Assumption a:bounded_kernel, for any $\gamma>0$, it holds with probability $1-\delta$ that

Figures (7)

Figure 1: Summary: given some controls and corresponding state trajectories of a nonlinear dynamical system, we use kernels to build a linear, data-driven model of the system. Kernels yield a computationally inefficient representation of the state space, due to the inversion of the kernel matrix, which we render computationally tractable using the Nyström method.
Figure 2: (a): Evaluation of the error between the control law defined in \ref{e:ctrl_policy} and the true optimal control for the system in \ref{e:hjb_sys}. Median, $15^{th}$ and $85^{th}$ percentile computed across 200 seeds. (b): A qualitative visualization of the optimal control retrieved, for $m=100$. (c): A comparison between the true optimal control, the one defined in \ref{e:ctrl_policy}, and its version obtained with an exact kernel, on the state space of the system \ref{e:hjb_sys}, for $m=100$. For the Nyström approach, we show median, $15^{th}$ and $85^{th}$ percentile computed across 200 seeds.
Figure 3: (a): $\mathrm{RMSE}_{\%}$ between the true trajectory and the one forecasted in open-loop for the Duffing oscillator, with three different feature representations (splines, eigenfunction approximation by korda2020optimal, and Nyström approximation of the Matérn-5/2 kernel), as a function of the dimensionality of the feature vector, $m$. Median, $15^{th}$ and $85^{th}$ percentile computed across 200 test trajectories with random initial conditions from the unit ball. (b)-(c): The LQR control strategy to stabilize towards the origin, with $m=20$, starting from the initial conditions $[-0.5, 0.0]^T$. Most importantly, when the splines are used, in 2 cases the LQR gain yields unstable nonlinear dynamics (not included in the percentile range). Median, $15^{th}$ and $85^{th}$ percentile computed across 200 seeds.
Figure 4: An example trajectory of the left and right lower corners of the cloth in the $y$-$z$ plane. The circle denotes the starting position, and the triangle denotes the final one.
Figure 5: (a): The RMSE computed between the true cloth trajectory and the one forecasted in open loop, with two different feature representations (splines vs. Nyström approximation of the RBF kernel). Median, $15^{th}$ and $85^{th}$ percentile computed across 10 testing trajectories, sampled with 20 different seeds. (b): Regulation error between the target pose of the cloth and the actual one. Median, $15^{th}$ and $85^{th}$ percentile computed across 50 seeds. (c): Scatter plot showing the time needed by each simulation of the cloth experiment to reach the minimum distance from the target.
...and 2 more figures

Theorems & Definitions (41)

Theorem 5: Convergence rate for $\tilde{G}_\gamma - G_\gamma$
proof
Lemma 6: Convergence rate for $\tilde{P} - P$
Remark 7
proof
Theorem 8: Convergence rate for $\hat{\mathcal{J}} - \mathcal{J}$
proof
Remark 9
Lemma 10
proof
...and 31 more
