Scaling Law of Neural Koopman Operators

Abulikemu Abuduweili; Yuyang Pang; Feihan Li; Changliu Liu

TL;DR

The paper derives a theoretical upper bound on the Koopman approximation error and decomposes it explicitly into sampling error and projection error, which provides a rigorous basis for the scaling law. It also introduces two lightweight regularizers for the neural Koopman operator.

Abstract

Data-driven neural Koopman operator theory has emerged as a powerful tool for linearizing and controlling nonlinear robotic systems. However, the performance of these data-driven models fundamentally depends on the trade-off between sample size and model dimensions, a relationship for which the scaling laws have remained unclear. This paper establishes a rigorous framework to address this challenge by deriving and empirically validating scaling laws that connect sample size, latent space dimension, and downstream control quality. We derive a theoretical upper bound on the Koopman approximation error, explicitly decomposing it into sampling error and projection error. We show that these terms decay at specific rates relative to dataset size and latent dimension, providing a rigorous basis for the scaling law. Based on the theoretical results, we introduce two lightweight regularizers for the neural Koopman operator: a covariance loss to help stabilize the learned latent features and an inverse control loss to ensure the model aligns with physical actuation. The results from systematic experiments across six robotic environments confirm that model fitting error follows the derived scaling laws, and the regularizers improve dynamic model fitting fidelity, with enhanced closed-loop control performance. Together, our results provide a simple recipe for allocating effort between data collection and model capacity when learning Koopman dynamics for control.
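
The decomposition described in the abstract can be sketched schematically. The notation below is assumed here for illustration and is not taken from the paper: $\mathcal{K}$ denotes the true Koopman operator, $\mathcal{K}_n$ its projection onto an $n$-dimensional feature subspace, and $\hat{\mathcal{K}}_{n,m}$ the estimate obtained from $m$ samples. Under these assumptions, the triangle inequality gives

$$
\| \mathcal{K} - \hat{\mathcal{K}}_{n,m} \| \;\le\; \underbrace{\| \mathcal{K} - \mathcal{K}_n \|}_{\text{projection error}} \;+\; \underbrace{\| \mathcal{K}_n - \hat{\mathcal{K}}_{n,m} \|}_{\text{sampling error}}
$$

In this schematic form, the projection term is governed by the latent dimension $n$, while the sampling term shrinks as the dataset size $m$ grows for a fixed $n$. The precise norms, constants, and decay rates are those stated in the paper's theorems and are not reproduced here.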

Paper Structure (46 sections, 7 theorems, 59 equations, 12 figures, 8 tables)

This paper contains 46 sections, 7 theorems, 59 equations, 12 figures, 8 tables.

Introduction
Preliminary: Koopman Operator Theory
Theoretical analysis of Scaling Law
Error Decomposition
Koopman Operator Theory
EDMD (Extended Dynamic Mode Decomposition)
Koopman Approximation Error
Convergence of Sampling error
Convergence of Projection Error
Convergence of Koopman Operator
More discussion about the assumptions
Assumption 6: Spectral Decay
Assumption 7: Subspace Approximation Capability
Optimization Error and the Scaling Floor
Practical Neural Koopman Operators
...and 31 more sections

Key Result

Proposition 1

Under the i.i.d., boundedness, and Gram-matrix assumptions, with probability at least $1 - 2\delta - \delta_m$, the sampling error is bounded by:

Figures (12)

Figure 2: Scaling of prediction error with training sample size ($m$). Subplots display error curves across latent dimension multipliers ($n_{\text{multi}}$). Dots denote random seeds, and solid lines show fitted power-law trends. The legend reports the scaling exponent $\alpha$ (error decay rate). Increasing dataset size consistently reduces error via a power law across environments, empirically validating theoretical sampling error convergence.
Figure 3: Scaling of prediction error with latent dimension $n$. Subplots display error curves across training sample sizes ($m$). Increasing latent dimension generally reduces error, empirically validating projection error convergence as Koopman subspace capacity expands.
Figure 4: Scaling of prediction error with latent dimension ($n$) under the coupled scaling law $m = \text{coeff} \cdot n \ln n$. As the data sufficiency coefficient increases, the error curves converge toward the asymptotic full sample baseline.
Figure 5: Franka arm tracking error versus training samples (left) and latent dimension (right). Auxiliary losses (+Both) yield consistent performance boosts.
Figure 6: G1 performance scales positively with sample size (left) and latent capacity (right). Combined auxiliary objectives (+Both) achieve the highest survival rates.
...and 7 more figures
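
The power-law trends and the coupled rule $m = \text{coeff} \cdot n \ln n$ mentioned in the captions above can be illustrated with a minimal sketch. It assumes an error model of the form $\text{error} \approx c \cdot m^{-\alpha}$ and fits it by ordinary least squares in log-log space; the function names `fit_power_law` and `coupled_sample_size`, and the synthetic data, are hypothetical and are not taken from the paper's code or results.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error ~ c * size**(-alpha) by least squares on log-log data.

    Returns (alpha, c). Hypothetical helper, not the authors' fitting code.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    x_mean = sum(xs) / count
    y_mean = sum(ys) / count
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of log(error) vs log(size)
    alpha = -slope                         # decay exponent reported as alpha
    c = math.exp(y_mean - slope * x_mean)  # prefactor from the intercept
    return alpha, c


def coupled_sample_size(n, coeff):
    """Sample budget under the coupled rule m = coeff * n * ln(n), rounded up."""
    return math.ceil(coeff * n * math.log(n))


# Synthetic, noise-free data generated from error = 2 * m**(-0.5);
# these numbers are illustrative only.
sizes = [1000, 2000, 4000, 8000, 16000]
errors = [2.0 * m ** -0.5 for m in sizes]
alpha, c = fit_power_law(sizes, errors)

# Sample budgets for a few latent dimensions at an assumed coefficient of 2.
budgets = {n: coupled_sample_size(n, coeff=2.0) for n in (16, 64, 256)}
```

On real measurements, one such fit would be run per environment and per latent dimension multiplier, and the spread across random seeds would show up as scatter around the fitted line.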

Theorems & Definitions (13)

Proposition 1: Sampling Error
Proof
Remark 1
Lemma 1: Matrix Bernstein Inequality (Tropp, 2012)
Proposition 2: Projection Error
Proof
Theorem 1: Error Bound
Corollary 1: Scaling Law
Corollary 2: Asymptotic Convergence
Proof
...and 3 more
