Learning Low-dimensional Latent Dynamics from High-dimensional Observations: Non-asymptotics and Lower Bounds

Yuyang Zhang, Shahriar Talebi, Na Li

TL;DR

The paper addresses learning latent low-dimensional dynamics from high-dimensional time-series observations in partially observed LTI systems. It introduces a two-stage algorithm, Col-ALGO, that first estimates the observer column space and then performs low-dimensional SYSID on the projected data. The algorithm achieves a non-asymptotic sample complexity of $\tilde{O}([n+\text{poly}(r,m)]/\varepsilon^2)$, and the paper proves a near-tight lower bound of $\Omega(n/\varepsilon^2)$. A key insight is that the error in learning the observer column space dominates the sample complexity. This motivates a meta-learning extension in which an observer shared across multiple systems is learned collectively, which can break the lower bound in certain regimes via a leave-one-out strategy. The work provides detailed probabilistic analyses, including a relative subspace perturbation bound tailored to dynamical data. Simulations show improved performance over standard Ho-Kalman in high-dimensional settings, as well as gains from meta-learning. Overall, the paper advances non-asymptotic guarantees for subspace-based SYSID and offers a scalable meta-learning framework for high-dimensional, sensor-rich applications such as neuroscience.

Abstract

In this paper, we focus on learning a linear time-invariant (LTI) model with low-dimensional latent variables but high-dimensional observations. We provide an algorithm that recovers the high-dimensional features, i.e. column space of the observer, embeds the data into low dimensions and learns the low-dimensional model parameters. Our algorithm enjoys a sample complexity guarantee of order $\tilde{\mathcal{O}}(n/\varepsilon^2)$, where $n$ is the observation dimension. We further establish a fundamental lower bound indicating this complexity bound is optimal up to logarithmic factors and dimension-independent constants. We show that this inevitable linear factor of $n$ is due to the learning error of the observer's column space in the presence of high-dimensional noises. Extending our results, we consider a meta-learning problem inspired by various real-world applications, where the observer column space can be collectively learned from datasets of multiple LTI systems. An end-to-end algorithm is then proposed, facilitating learning LTI systems from a meta-dataset which breaks the sample complexity lower bound in certain scenarios.
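The two-stage idea in the abstract can be illustrated with a small NumPy sketch. This is not the paper's algorithm: plain PCA on the sample covariance of the observations stands in for the column-space step, and the system matrices, dimensions, noise level, and function names (`simulate_lti`, `estimate_column_space`, `subspace_error`) are all hypothetical choices made for illustration. The low-dimensional SYSID step itself is omitted; the sketch only shows the data that would be passed to it.

```python
import numpy as np

def simulate_lti(A, B, C, T, noise_std, rng):
    """Roll out x_{t+1} = A x_t + B u_t + w_t, y_t = C x_t + v_t
    with Gaussian inputs and Gaussian process/observation noise."""
    r, m = B.shape
    n = C.shape[0]
    U = rng.standard_normal((T, m))
    Y = np.empty((T, n))
    x = np.zeros(r)
    for t in range(T):
        Y[t] = C @ x + noise_std * rng.standard_normal(n)
        x = A @ x + B @ U[t] + noise_std * rng.standard_normal(r)
    return U, Y

def estimate_column_space(Y, r):
    """Assumed stand-in for the column-space step: top-r eigenvectors
    of the sample covariance of the observations (plain PCA)."""
    cov = Y.T @ Y / Y.shape[0]
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -r:]            # orthonormal basis, shape (n, r)

def subspace_error(P, Q):
    """Spectral norm of the difference of the two orthogonal projectors."""
    return np.linalg.norm(P @ P.T - Q @ Q.T, ord=2)

rng = np.random.default_rng(0)
n, r, m = 100, 3, 3                   # observation, latent, input dimensions (illustrative)
A = 0.9 * np.eye(r)                   # stable latent dynamics
B = np.eye(r, m)
C, _ = np.linalg.qr(rng.standard_normal((n, r)))  # observer with orthonormal columns

# Stage 1: estimate the observer column space from a first trajectory.
T1, T2 = 2000, 500
U1, Y1 = simulate_lti(A, B, C, T1, 0.1, rng)
Phi_hat = estimate_column_space(Y1, r)

# Stage 2: project a second trajectory to r dimensions; any low-dimensional
# SYSID routine would then be run on (U2, Z2) instead of (U2, Y2).
U2, Y2 = simulate_lti(A, B, C, T2, 0.1, rng)
Z2 = Y2 @ Phi_hat                     # shape (T2, r)

err = subspace_error(Phi_hat, C)
print("projected data shape:", Z2.shape)
print("column-space error:", err)
```

The sketch uses separate trajectories for the two stages, mirroring the two datasets that appear in the paper's main theorem. After projection, the identification problem involves $r$-dimensional observations rather than $n$-dimensional ones.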

Paper Structure (29 sections, 25 theorems, 231 equations, 1 figure, 3 algorithms)

Key Result

Theorem 3.4

Consider system $\mathcal{M}$ and datasets $\mathcal{D}_1=\mathcal{U}_1\cup\mathcal{Y}_1$, $\mathcal{D}_2=\mathcal{U}_2\cup\mathcal{Y}_2$ (with lengths $T_1$ and $T_2$, respectively) in hdsysid. Suppose $\mathcal{M}$ satisfies Assumption assmp:sys1. If […], then $(\hat{A},\hat{B},\hat{C})$ from alg:1single satisfy the following for some invertible matrix $S$ with probability at least $1-\delta$: […]. Here $\kappa_1=\kappa_1(\widehat{\mathcal{M}}, \mathcal{U}_2,\delta)$, $\kappa_2=\dots$
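The phrase "for some invertible matrix $S$" reflects a standard fact about state-space models: input-output data determine $(A,B,C)$ only up to a similarity transform, because the Markov parameters $CA^kB$ are unchanged when $(A,B,C)$ is replaced by $(S^{-1}AS,\ S^{-1}B,\ CS)$. The sketch below checks this numerically. The matrices, dimensions, and the particular $S$ are arbitrary illustrative choices, and the exact form in which $S$ enters the theorem's bounds is not shown in this excerpt.

```python
import numpy as np

rng = np.random.default_rng(1)
r, m, p = 3, 2, 4                       # latent, input, output dimensions (illustrative)
A = 0.5 * rng.standard_normal((r, r))
B = rng.standard_normal((r, m))
C = rng.standard_normal((p, r))

# A fixed, well-conditioned invertible matrix (upper triangular, det = 6).
S = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 3.0]])

# Similarity-transformed realization (S^{-1} A S, S^{-1} B, C S).
A_s = np.linalg.solve(S, A @ S)
B_s = np.linalg.solve(S, B)
C_s = C @ S

def markov_parameters(A, B, C, K):
    """Return the list [C B, C A B, ..., C A^(K-1) B]."""
    return [C @ np.linalg.matrix_power(A, k) @ B for k in range(K)]

original = markov_parameters(A, B, C, 5)
transformed = markov_parameters(A_s, B_s, C_s, 5)
same = all(np.allclose(G, H) for G, H in zip(original, transformed))
print("Markov parameters agree:", same)
```

Since both realizations produce identical input-output behavior, an estimation error for $(\hat{A},\hat{B},\hat{C})$ can only be meaningfully stated after aligning with some such $S$.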

Figures (1)

  • Figure 1: Left: Error for Col-Approx (Algorithm \ref{alg:column_space}). Center: Error for Col-ALGO (Algorithm \ref{alg:1single}) and standard Ho-Kalman with $n=320$. Right: Error for meta-algo (Algorithm \ref{alg:2meta}) and standard Ho-Kalman for $n\in[80,320]$.

Theorems & Definitions (51)

  • Definition 3.2: oracle
  • Remark 3.3: Why two trajectories?
  • Theorem 3.4
  • Remark 3.5
  • Corollary 3.6
  • Lemma 3.7
  • Proof sketch of \ref{lem:1col}
  • Theorem 4.1
  • Theorem 5.1
  • Theorem 1.1: \ref{thm:1single} restated
  • ...and 41 more