Learning of Linear Dynamical Systems as a Non-Commutative Polynomial Optimization Problem

Quan Zhou, Jakub Marecek

TL;DR

This work addresses proper learning of linear dynamical systems (LDS) with unknown hidden-state dimension from time-series observations. It recasts LDS identification as a non-commutative polynomial optimization problem (NCPOP) and uses the Navascués-Pironio-Acín (NPA) hierarchy to obtain globally convergent SDP relaxations, with minimizer extraction via the Gelfand–Naimark–Segal (GNS) construction. The method accommodates unknown state dimension and noisy data, and its convergence guarantees hold under an Archimedean quadratic module. Empirical tests on synthetic and real data show better performance than standard baselines, and sparsity-exploiting SDP variants improve scalability. The approach adds provable convergence in a non-convex, operator-valued setting to the system-identification toolkit, and numerical experiments and runtime analyses support its practical viability. Overall, it offers a principled, dimension-agnostic framework for proper LDS learning, with potential extensions to constrained or quantum-inspired problems.

Abstract

There has been much recent progress in forecasting the next observation of a linear dynamical system (LDS), known as improper learning, as well as in the estimation of its system matrices, known as proper learning of LDS. We present an approach to proper learning of LDS which, in spite of the non-convexity of the problem, guarantees global convergence of numerical solutions to a least-squares estimator. We present promising computational results.
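To make the non-convexity mentioned in the abstract concrete, the following is a minimal sketch, not the paper's formulation. It assumes the state-space form $\phi_t = G\phi_{t-1} + w_t$, $y_t = F^\top \phi_t + v_t$ with scalar output, and a simplified noise-free squared-error loss; the function names `simulate_lds` and `least_squares_loss` are hypothetical. Two different parameter settings reproduce the data exactly, while their midpoint does not, so the loss cannot be convex in $(F, \phi_0)$.

```python
import random


def simulate_lds(G, F, phi0, T, w_std=0.0, v_std=0.0, seed=0):
    """Simulate phi_t = G phi_{t-1} + w_t, y_t = F^T phi_t + v_t (assumed form)."""
    rng = random.Random(seed)
    n = len(phi0)
    phi = list(phi0)
    ys = []
    for _ in range(T):
        # State update with Gaussian process noise of standard deviation w_std.
        phi = [
            sum(G[i][j] * phi[j] for j in range(n)) + rng.gauss(0.0, w_std)
            for i in range(n)
        ]
        # Scalar observation with Gaussian observation noise of std v_std.
        ys.append(sum(F[i] * phi[i] for i in range(n)) + rng.gauss(0.0, v_std))
    return ys


def least_squares_loss(G, F, phi0, ys):
    """Sum of squared output errors of the noise-free model (G, F, phi0)."""
    preds = simulate_lds(G, F, phi0, len(ys))
    return sum((y - p) ** 2 for y, p in zip(ys, preds))


# Illustrative (made-up) system and noise-free data.
G_true = [[0.9, 0.2], [0.0, 0.8]]
F_true = [1.0, 0.5]
phi0_true = [1.0, -1.0]
ys = simulate_lds(G_true, F_true, phi0_true, T=20)

# Rescaling F by 2 and phi0 by 1/2 leaves every output unchanged,
# so both parameter settings fit the data equally well.
F_alt = [2.0 * f for f in F_true]
phi0_alt = [0.5 * p for p in phi0_true]

# The midpoint of the two settings scales the outputs by 1.5 * 0.75 = 1.125.
F_mid = [(a + b) / 2.0 for a, b in zip(F_true, F_alt)]
phi0_mid = [(a + b) / 2.0 for a, b in zip(phi0_true, phi0_alt)]

loss_true = least_squares_loss(G_true, F_true, phi0_true, ys)
loss_alt = least_squares_loss(G_true, F_alt, phi0_alt, ys)
loss_mid = least_squares_loss(G_true, F_mid, phi0_mid, ys)
```

Here `loss_true` and `loss_alt` are both zero while `loss_mid` is strictly positive, which a convex function would not allow. The loss is a polynomial in the entries of $G$, $F$ and $\phi_0$, which is what makes a polynomial-optimization treatment natural.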

Paper Structure

This paper contains 29 sections, 11 theorems, 49 equations, 4 figures, 1 table.

Key Result

Theorem 2

For any observable linear system $(G,F,V,W)$, for any length $T$ of a time window, and any error $\epsilon > 0$, under Assumption Archimedean, there is a convex optimization problem whose objective function value is at most $\epsilon$ away from obj_B subject to (NFF_1--NFF_2). Furthermore, an estima

Figures (4)

  • Figure 1: Upper: The nrmse fitness values \ref{NRMSE} of $81$ experiments of our method at different combinations of the standard deviations of process noise $W$ and observation noise $V$. Lower: the same at different combinations of parameters $c_1$ and $c_2$. Both use data generated from the systems in \ref{equ:LDS}. Lighter colors indicate higher nrmse and thus better simulation performance.
  • Figure 2: The nrmse fitness values \ref{NRMSE} of our method compared to the leading system identification methods implemented in Matlab™ System Identification Toolbox™. Upper & middle: the mean (solid lines) and mean $\pm$ one standard deviation (dashed lines) of nrmse as the standard deviations of both process noise and observation noise increase in lockstep from $0.1$ to $0.9$. The time series used for simulation are generated from the systems in \ref{equ:LDS} (upper) and the higher differential order systems in \ref{equ:LDS-higher} (middle), with the dimension $n$ of both systems being $2$. Lower: the mean (solid dots) and mean $\pm$ one standard deviation (vertical error bars) of nrmse at different dimensions $n$ of the underlying systems in \ref{equ:LDS}. Higher nrmse indicates better simulation performance.
  • Figure 3: Left: The time series of stock price (dark) for the 21$^\textrm{st}$-121$^\textrm{st}$ period used in arima_aaai, and the predicted outputs of our method (yellow) compared against "least squares auto" (blue) implemented in Matlab™ System Identification Toolbox™. The dimension $d$ of "least squares auto" is iterated from $1$ up to a maximum of $4$. The percentages in the legend are the corresponding nrmse values of one-step predictions. Right: a zoom-in on the 66$^\textrm{th}$-101$^\textrm{st}$ period.
  • Figure 4: Left: The (solid or dashed) curves show the mean runtime of the SDP relaxation of the baseline "least squares auto" (blue), the TSSOS hierarchy (green) and the NPA hierarchy (yellow), at different moment orders $k$ or dimensions $d$. The mean $\pm$ one standard deviation of runtime is displayed by shaded error bands. Upper-right: The mean and mean $\pm$ one standard deviation of runtime of the SDP relaxation of the TSSOS hierarchy at moment order $k=1$ and of "least squares auto" with dimension $d=1$. Lower-right: The red bars display the sparsity of the NPA hierarchy in the experiment on stock-market data against the length of the time window, measured as the ratio of non-zero coefficients to all coefficients in the SDP relaxations.
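
The captions above report an nrmse fitness for which higher values are better. The sketch below is an assumption about that metric, not a quotation of the paper's definition: it uses the fit convention common in system-identification tooling, $1 - \lVert y - \hat{y} \rVert / \lVert y - \bar{y} \rVert$, expressed as a fraction rather than a percentage. The function name `nrmse_fitness` is hypothetical.

```python
import math


def nrmse_fitness(y_true, y_pred):
    """Assumed fit measure: 1 - ||y - y_hat|| / ||y - mean(y)||.

    Equals 1.0 for a perfect prediction and 0.0 when the prediction is
    no better than the constant mean of the observations.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_y = sum(y_true) / len(y_true)
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)))
    spread = math.sqrt(sum((a - mean_y) ** 2 for a in y_true))
    if spread == 0.0:
        raise ValueError("y_true is constant; the fitness is undefined")
    return 1.0 - err / spread


# Made-up observations, used only to exercise the function.
observed = [1.0, 2.0, 3.0]
fit_perfect = nrmse_fitness(observed, [1.0, 2.0, 3.0])
fit_mean = nrmse_fitness(observed, [2.0, 2.0, 2.0])
```

Multiplying the result by 100 gives the percentage form that the Figure 3 legend appears to use. Under this convention the value can be negative when the prediction is worse than the mean.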

Theorems & Definitions (16)

  • Theorem 2
  • proof
  • Definition 3: burgdorf2016optimization, Definition 1.50
  • Proposition 4: burgdorf2016optimization, Proposition 1.51
  • Lemma 6: helton2002positive, Lemma 2.1
  • proof
  • Proposition 7: burgdorf2016optimization, Propositions 1.16 and 3.10
  • Proposition 8: burgdorf2016optimization, Proposition 5.7
  • Corollary 10
  • Lemma 11: burgdorf2016optimization, Lemma 1.44
  • ...and 6 more