Learning Low-dimensional Latent Dynamics from High-dimensional Observations: Non-asymptotics and Lower Bounds
Yuyang Zhang, Shahriar Talebi, Na Li
TL;DR
The paper studies how to learn low-dimensional latent dynamics from high-dimensional time-series observations in partially observed LTI systems. It introduces a two-stage algorithm, Col-ALGO, that first estimates the column space of the observer and then performs low-dimensional system identification (SYSID) on the projected data. The algorithm achieves a non-asymptotic sample complexity of $\tilde{O}([n+\text{poly}(r,m)]/\varepsilon^2)$, where $n$ is the observation dimension, and the paper proves a near-tight lower bound of $\Omega(n/\varepsilon^2)$. A key insight is that the error in learning the observer column space dominates the sample complexity. This motivates a meta-learning extension in which an observer shared across multiple systems is learned collectively, which makes it possible to break the lower bound in certain regimes via a leave-one-out strategy. The work provides detailed probabilistic analyses, including a relative subspace perturbation bound tailored to dynamical data, and reports simulations showing improved performance over standard Ho-Kalman in high-dimensional settings as well as gains from meta-learning. Overall, the paper advances non-asymptotic guarantees for subspace-based SYSID and offers a scalable meta-learning framework for high-dimensional, sensor-rich applications such as neuroscience.
Abstract
In this paper, we focus on learning a linear time-invariant (LTI) model with low-dimensional latent variables but high-dimensional observations. We provide an algorithm that recovers the high-dimensional features, i.e., the column space of the observer, embeds the data into low dimensions, and learns the low-dimensional model parameters. Our algorithm enjoys a sample complexity guarantee of order $\tilde{\mathcal{O}}(n/\varepsilon^2)$, where $n$ is the observation dimension. We further establish a fundamental lower bound indicating that this complexity bound is optimal up to logarithmic factors and dimension-independent constants. We show that this inevitable linear factor of $n$ is due to the learning error of the observer's column space in the presence of high-dimensional noise. Extending our results, we consider a meta-learning problem inspired by various real-world applications, where the observer column space can be collectively learned from datasets of multiple LTI systems. We then propose an end-to-end algorithm that learns LTI systems from a meta-dataset and breaks the sample complexity lower bound in certain scenarios.
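
To make the pipeline described in the abstract concrete, the following is a minimal NumPy sketch, not the paper's algorithm. It assumes a standard state-space model $x_{t+1} = A x_t + B u_t + w_t$, $y_t = C x_t + \eta_t$ with isotropic observation noise, and it reads $r$ as the latent dimension and $m$ as the input dimension; neither the model nor these readings are spelled out in the text above. The function names (`simulate_lti`, `estimate_column_space`, `estimate_markov_parameters`) and all numerical values are hypothetical. Stage one estimates the column space of $C$ from the top eigenvectors of the empirical observation covariance; stage two is reduced to a least-squares fit of the projected Markov parameters, which is the usual input to a Ho-Kalman-type routine.

```python
import numpy as np

def simulate_lti(A, B, C, T, obs_noise_std, proc_noise_std, rng):
    """Simulate the assumed model; returns inputs U (T, m) and observations Y (T, n)."""
    r, m = B.shape
    n = C.shape[0]
    x = np.zeros(r)
    U = rng.standard_normal((T, m))
    Y = np.empty((T, n))
    for t in range(T):
        Y[t] = C @ x + obs_noise_std * rng.standard_normal(n)
        x = A @ x + B @ U[t] + proc_noise_std * rng.standard_normal(r)
    return U, Y

def estimate_column_space(Y, r):
    """Top-r eigenvectors of the empirical covariance of the observations."""
    cov = Y.T @ Y / Y.shape[0]
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -r:]            # orthonormal basis, shape (n, r)

def estimate_markov_parameters(U, Y_low, K):
    """Least-squares fit of y_t ~ sum_{k=1..K} G_k u_{t-k}; returns [G_1, ..., G_K]."""
    T = U.shape[0]
    # Each regressor row stacks u_{t-1}, ..., u_{t-K} for t = K, ..., T-1.
    Z = np.hstack([U[K - k : T - k] for k in range(1, K + 1)])
    G, *_ = np.linalg.lstsq(Z, Y_low[K:], rcond=None)
    return G.T  # shape (r, K * m)

rng = np.random.default_rng(0)
n, r, m, T, K = 50, 2, 2, 5000, 10        # illustrative sizes only
A = np.diag([0.8, 0.5])                   # stable latent dynamics
B = np.eye(r, m)
C = np.linalg.qr(rng.standard_normal((n, r)))[0]  # observer with orthonormal columns

U, Y = simulate_lti(A, B, C, T, obs_noise_std=0.5, proc_noise_std=0.1, rng=rng)

# Stage 1: estimate the observer column space and project the observations.
Phi_hat = estimate_column_space(Y, r)
Y_low = Y @ Phi_hat                       # shape (T, r)

# Distance between true and estimated subspaces (spectral norm of projector difference).
dist = np.linalg.norm(C @ C.T - Phi_hat @ Phi_hat.T, 2)

# Stage 2 (simplified): Markov parameters of the projected, low-dimensional system.
G = estimate_markov_parameters(U, Y_low, K)
```

In this sketch the second stage regresses only $r$-dimensional data, so its cost does not depend on $n$; the dependence on $n$ enters through the covariance eigendecomposition in stage one, which matches the abstract's remark that the column-space error is where the linear factor of $n$ arises.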
