Learning Dynamical Systems by Leveraging Data from Similar Systems
Lei Xin, Lintao Ye, George Chiu, Shreyas Sundaram
TL;DR
This work studies finite-sample learning of a linear time-invariant system using data from a similar auxiliary system. It introduces a weighted least-squares framework that blends rollouts from the true and auxiliary systems using a tunable weight $q$ and optional regularization $\lambda$, and it derives data-independent and data-dependent error bounds that decompose the estimation error into a noise-driven component and a model-difference component. The results show that auxiliary data can reduce the intrinsic noise-driven error at the cost of bias from model mismatch, and they provide guidelines and computable bounds for selecting $q$ and $\lambda$. Numerical experiments illustrate how trajectory lengths and $q$ affect performance across scenarios and demonstrate how the data-dependent bound can guide weight selection in practice. Overall, the paper offers a principled, transfer-learning-style approach to system identification with provable guarantees and practical guidance for leveraging related systems.
Abstract
We consider the problem of learning the dynamics of a linear system when one has access to data generated by an auxiliary system that shares similar (but not identical) dynamics, in addition to data from the true system. We use a weighted least squares approach and provide finite-sample bounds on the error of the learned model as a function of the number of samples, various parameters of the two systems, and the weight assigned to the auxiliary data. We show that the auxiliary data can help to reduce the intrinsic system identification error due to noise, at the price of adding an error component that is due to the differences between the two system models. We further provide a data-dependent bound that is computable when some prior knowledge about the systems, such as upper bounds on noise levels and model difference, is available. This bound can also be used to determine the weight that should be assigned to the auxiliary data during the model training stage.
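As an illustration of the weighted least squares idea, the following is a minimal sketch rather than the paper's exact estimator. It assumes dynamics of the form $x_{t+1} = A x_t + B u_t + w_t$ with Gaussian inputs and noise, stacks regressors $z_t = [x_t; u_t]$, and minimizes the squared one-step prediction error on the true-system data plus $q$ times the same error on the auxiliary data, with an optional ridge term $\lambda \|\Theta\|_F^2$ for $\Theta = [A \; B]$. The function names `simulate` and `weighted_ls`, the input model, and this precise form of the objective are assumptions made for the example.

```python
import numpy as np

def simulate(A, B, T, noise_std, rng):
    """Roll out x_{t+1} = A x_t + B u_t + w_t from x_0 = 0 (assumed model).

    Inputs u_t and noise w_t are i.i.d. Gaussian. Returns regressors Z with
    rows [x_t, u_t] and targets Y with rows x_{t+1}.
    """
    n, m = B.shape
    x = np.zeros(n)
    Z, Y = [], []
    for _ in range(T):
        u = rng.standard_normal(m)
        x_next = A @ x + B @ u + noise_std * rng.standard_normal(n)
        Z.append(np.concatenate([x, u]))
        Y.append(x_next)
        x = x_next
    return np.array(Z), np.array(Y)  # shapes (T, n + m) and (T, n)

def weighted_ls(Z, Y, Z_aux, Y_aux, q, lam=0.0):
    """Estimate Theta = [A B] from true data (Z, Y) and auxiliary data.

    Solves the normal equations of
        sum ||y - Theta z||^2 + q * sum ||y_aux - Theta z_aux||^2
        + lam * ||Theta||_F^2.
    """
    d = Z.shape[1]
    gram = Z.T @ Z + q * (Z_aux.T @ Z_aux) + lam * np.eye(d)
    cross = Z.T @ Y + q * (Z_aux.T @ Y_aux)
    return np.linalg.solve(gram, cross).T  # shape (n, n + m)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = np.array([[0.9, 0.1], [0.0, 0.8]])
    B = np.array([[0.0], [1.0]])
    A_aux = A + 0.02 * np.ones((2, 2))  # similar but not identical dynamics

    Z, Y = simulate(A, B, T=20, noise_std=0.5, rng=rng)             # scarce true data
    Z_aux, Y_aux = simulate(A_aux, B, T=500, noise_std=0.5, rng=rng)  # plentiful auxiliary data

    theta_true = np.hstack([A, B])
    for q in (0.0, 0.1, 1.0):
        theta_hat = weighted_ls(Z, Y, Z_aux, Y_aux, q)
        err = np.linalg.norm(theta_hat - theta_true, 2)
        print(f"q = {q:4.1f}  spectral-norm error = {err:.4f}")
```

Setting $q = 0$ recovers ordinary least squares on the true-system data alone, while larger $q$ leans more heavily on the auxiliary data, trading noise-driven error for bias from the model difference.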
