Learning linear dynamical systems under convex constraints
Hemant Tyagi, Denis Efimov
TL;DR
This work develops non-asymptotic Frobenius-norm error bounds for identifying a strictly stable linear dynamical system (LDS) whose system matrix satisfies a convex constraint $A^*\in\mathcal{K}$. Using the geometry of the tangent cone of $\mathcal{K}$ at $A^*$ together with Talagrand's $\gamma$-functionals, the authors bound the error of the constrained least-squares (LS) estimator in terms of the local complexity of $\mathcal{K}$ at $A^*$. Two main theorems are given, covering the case where the tangent cone at $A^*$ is small and the case where it is not. The general results are instantiated for four structured settings (subspace, sparsity, convex regression, and Lipschitz regression), and in many regimes they require substantially fewer samples than unconstrained ordinary least squares (OLS). The analysis relies on concentration tools for second-order subgaussian chaos and yields explicit sample-complexity formulas that quantify the benefit of building known structure into LDS identification. The framework thus supports reliable finite-time system identification from limited data by exploiting convex structural information about $A^*$.
Abstract
We consider the problem of finite-time identification of linear dynamical systems from $T$ samples of a single trajectory. Recent results have predominantly focused on the setup where either no structural assumption is made on the system matrix $A^* \in \mathbb{R}^{n \times n}$, or specific structural assumptions (e.g. sparsity) are made on $A^*$. We assume prior structural information on $A^*$ is available, which can be captured in the form of a convex set $\mathcal{K}$ containing $A^*$. For the solution of the ensuing constrained least squares estimator, we derive non-asymptotic error bounds in the Frobenius norm that depend on the local size of $\mathcal{K}$ at $A^*$. To illustrate the usefulness of these results, we instantiate them for four examples, namely when (i) $A^*$ is sparse and $\mathcal{K}$ is a suitably scaled $\ell_1$ ball; (ii) $\mathcal{K}$ is a subspace; (iii) $\mathcal{K}$ consists of matrices each of which is formed by sampling a bivariate convex function on a uniform $n \times n$ grid (convex regression); (iv) $\mathcal{K}$ consists of matrices each row of which is formed by uniform sampling (with step size $1/T$) of a univariate Lipschitz function. In all these situations, we show that $A^*$ can be reliably estimated for values of $T$ much smaller than what is needed for the unconstrained setting.
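As a rough illustration of setting (i), the sketch below simulates a single trajectory and fits the constrained least-squares estimator over an entrywise $\ell_1$ ball by projected gradient descent. It is not the authors' implementation. The model $x_{t+1} = A^* x_t + \eta_t$ with standard Gaussian noise, the radius $\|A^*\|_1$, the solver, the spectral-norm scaling used to make $A^*$ stable, and all function names (`project_l1_ball`, `simulate_lds`, `constrained_ls`) are assumptions made for this example.

```python
import numpy as np

def project_l1_ball(V, radius):
    """Euclidean projection of V onto the entrywise l1 ball of the given radius."""
    v = V.ravel()
    if np.abs(v).sum() <= radius:
        return V.copy()
    u = np.sort(np.abs(v))[::-1]                 # magnitudes, descending
    css = np.cumsum(u)
    k = np.arange(1, u.size + 1)
    rho = np.nonzero(u * k > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)    # soft-threshold level
    return (np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)).reshape(V.shape)

def simulate_lds(A, T, rng, noise_std=1.0):
    """Return states x_0, ..., x_T (rows) of x_{t+1} = A x_t + noise, with x_0 = 0."""
    n = A.shape[0]
    X = np.zeros((T + 1, n))
    for t in range(T):
        X[t + 1] = A @ X[t] + noise_std * rng.standard_normal(n)
    return X

def constrained_ls(X, radius, n_iter=500):
    """Projected gradient descent for min 0.5 * sum_t ||x_{t+1} - A x_t||^2 over the l1 ball."""
    X0, X1 = X[:-1], X[1:]                       # regressors x_t, targets x_{t+1}
    G = X0.T @ X0                                # sum_t x_t x_t^T
    C = X1.T @ X0                                # sum_t x_{t+1} x_t^T
    step = 1.0 / np.linalg.eigvalsh(G)[-1]       # 1 / (Lipschitz constant of the gradient)
    A = np.zeros((X.shape[1], X.shape[1]))
    for _ in range(n_iter):
        A = project_l1_ball(A - step * (A @ G - C), radius)
    return A

# Hypothetical demo: a sparse A* scaled to spectral norm 0.5 (so spectral radius < 1).
rng = np.random.default_rng(0)
n, T = 20, 60
A_star = np.zeros((n, n))
idx = rng.choice(n * n, size=n, replace=False)
A_star.flat[idx] = rng.standard_normal(n)
A_star *= 0.5 / np.linalg.norm(A_star, 2)

X = simulate_lds(A_star, T, rng)
A_hat = constrained_ls(X, radius=np.abs(A_star).sum())
A_ols = np.linalg.lstsq(X[:-1], X[1:], rcond=None)[0].T   # unconstrained baseline

print("constrained LS error:", np.linalg.norm(A_hat - A_star))
print("unconstrained LS error:", np.linalg.norm(A_ols - A_star))
```

The radius here is set from the true $A^*$, which is unavailable in practice; it stands in for the "suitably scaled" ball mentioned in the abstract. Comparing the two printed Frobenius errors for several values of `T` gives an informal feel for the effect of the constraint, though a single random run does not verify the paper's bounds.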
