A Tutorial on the Non-Asymptotic Theory of System Identification
Ingvar Ziemann, Anastasios Tsiamis, Bruce Lee, Yassir Jedra, Nikolai Matni, George J. Pappas
TL;DR
The paper develops a non-asymptotic, finite-sample theory for linear system identification, delivering concrete high-probability error bounds for least-squares estimators in ARX and state-space models. Central to the approach are tools from high-dimensional probability, including the Hanson-Wright inequality, covering arguments, and self-normalized martingale bounds, which together yield rates scaling like $\varepsilon \propto \text{(noise)} \cdot \sqrt{(d+\log(1/\delta))/T}$ under appropriate persistency of excitation. A key methodological theme is the basic inequality, which remains valid beyond linear settings and enables sparsity-aware, bias-aware, and nonlinear extensions with finite-sample guarantees. The results illuminate the roles of burn-in, excitation, the smallest eigenvalue of empirical covariances, and SNR in determining feasible sample sizes and rates, while providing streamlined proofs and a unified toolkit for finite-sample analysis. The framework extends to nonlinear regression with realizability assumptions, offering a path toward finite-sample guarantees in broader identification problems with multiple trajectories or finite hypothesis classes. Overall, the tutorial bridges control theory with high-dimensional statistics to yield practical, provable finite-sample guarantees for system identification tasks.
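As a rough illustration of where the listed ingredients enter, consider the simplest autoregressive case $x_{t+1} = A x_t + w_t$ with state dimension $d$ and a trajectory of length $T$. The model and the symbols $A$, $x_t$, $w_t$ are assumptions introduced here for illustration and are not taken from the summary above. Under that assumption, the least-squares estimator and its error take the form

$$
\hat{A} = \Big(\sum_{t=0}^{T-1} x_{t+1} x_t^\top\Big)\Big(\sum_{t=0}^{T-1} x_t x_t^\top\Big)^{-1},
\qquad
\hat{A} - A = \Big(\sum_{t=0}^{T-1} w_t x_t^\top\Big)\Big(\sum_{t=0}^{T-1} x_t x_t^\top\Big)^{-1}.
$$

The second identity follows by substituting $x_{t+1} = A x_t + w_t$ into the first. In this sketch, the noise-state cross term is the kind of quantity that self-normalized martingale bounds are designed to control, while the inverse factor is governed by the smallest eigenvalue of the empirical covariance, which is where excitation and burn-in come in.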
Abstract
This tutorial serves as an introduction to recently developed non-asymptotic methods in the theory of -- mainly linear -- system identification. We emphasize tools we deem particularly useful for a range of problems in this domain, such as the covering technique, the Hanson-Wright inequality and the method of self-normalized martingales. We then employ these tools to give streamlined proofs of the performance of various least-squares-based estimators for identifying the parameters in autoregressive models. We conclude by sketching out how the ideas presented herein can be extended to certain nonlinear identification problems.
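The finite-sample behavior of a least-squares estimator in an autoregressive model can be checked numerically. The following is a minimal sketch, not the tutorial's own code: it assumes a hypothetical stable two-dimensional system $x_{t+1} = A x_t + w_t$ with standard Gaussian noise, and the names `simulate`, `ols_estimate`, `mean_error`, and `A_true` are invented for this example. It estimates $A$ by ordinary least squares from a single trajectory and reports the average operator-norm error for several trajectory lengths $T$.

```python
import numpy as np


def simulate(A, T, rng, noise_std=1.0):
    """Roll out x_{t+1} = A x_t + w_t from x_0 = 0 with Gaussian noise w_t."""
    d = A.shape[0]
    X = np.zeros((T + 1, d))
    for t in range(T):
        X[t + 1] = A @ X[t] + noise_std * rng.standard_normal(d)
    return X


def ols_estimate(X):
    """Least-squares estimate of A from one trajectory X of shape (T + 1, d)."""
    past, future = X[:-1], X[1:]
    # Solve past @ A^T ~= future in the least-squares sense.
    A_hat_T, *_ = np.linalg.lstsq(past, future, rcond=None)
    return A_hat_T.T


def mean_error(A, T, trials, seed=0):
    """Average operator-norm error of the OLS estimate over independent runs."""
    rng = np.random.default_rng(seed)
    errs = [
        np.linalg.norm(ols_estimate(simulate(A, T, rng)) - A, 2)
        for _ in range(trials)
    ]
    return float(np.mean(errs))


# Hypothetical stable system (spectral radius 0.5), chosen only for illustration.
A_true = np.array([[0.5, 0.2],
                   [0.0, 0.3]])

errors = {T: mean_error(A_true, T, trials=20) for T in (100, 400, 1600, 6400)}
for T, err in errors.items():
    # If the error decays like 1/sqrt(T), err * sqrt(T) should stay roughly flat.
    print(f"T={T:5d}  error={err:.4f}  error*sqrt(T)={err * np.sqrt(T):.3f}")
```

If the error scales like $\sqrt{1/T}$, the last printed column should stay roughly constant as $T$ grows. This is only an empirical sanity check on one example system and does not verify any high-probability bound.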
