A Tutorial on the Non-Asymptotic Theory of System Identification

Ingvar Ziemann, Anastasios Tsiamis, Bruce Lee, Yassir Jedra, Nikolai Matni, George J. Pappas

TL;DR

The paper develops a non-asymptotic, finite-sample theory for linear system identification, delivering concrete high-probability error bounds for least-squares estimators in ARX and state-space models. Central to the approach are tools from high-dimensional probability, including the Hanson-Wright inequality, covering arguments, and self-normalized martingale bounds, which together yield rates scaling like $\varepsilon \propto \text{(noise)} \sqrt{(d+\log(1/\delta))/T}$ under appropriate persistency of excitation. A key methodological theme is the basic inequality, which remains valid beyond linear settings and enables sparsity-aware, bias-aware, and nonlinear extensions with finite-sample guarantees. The results illuminate the roles of burn-in, excitation, the smallest eigenvalue of empirical covariances, and SNR in determining feasible sample sizes and rates, while providing streamlined proofs and a unified toolkit for finite-sample analysis. The framework extends to nonlinear regression with realizability assumptions, offering a path toward finite-sample guarantees in broader identification problems with multiple trajectories or finite hypothesis classes. Overall, the tutorial bridges control theory with high-dimensional statistics to yield practical, provable finite-sample guarantees for system identification tasks.

Abstract

This tutorial serves as an introduction to recently developed non-asymptotic methods in the theory of -- mainly linear -- system identification. We emphasize tools we deem particularly useful for a range of problems in this domain, such as the covering technique, the Hanson-Wright Inequality and the method of self-normalized martingales. We then employ these tools to give streamlined proofs of the performance of various least-squares based estimators for identifying the parameters in autoregressive models. We conclude by sketching out how the ideas presented herein can be extended to certain nonlinear identification problems.
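
The kind of least-squares estimator the abstract refers to can be illustrated with a small simulation. The following is a minimal sketch, not taken from the paper: it assumes a first-order autoregressive model $x_{t+1} = A x_t + w_t$ with Gaussian noise, and the names `simulate`, `least_squares`, `A_true`, and the chosen trajectory lengths are all hypothetical. It fits $A$ by ordinary least squares and records the average operator-norm error at two trajectory lengths, which one would expect to shrink roughly like $1/\sqrt{T}$ for a stable system.

```python
import numpy as np


def simulate(A, T, noise_std, rng):
    """Roll out x_{t+1} = A x_t + w_t from x_0 = 0 with Gaussian noise w_t."""
    d = A.shape[0]
    X = np.zeros((T + 1, d))
    for t in range(T):
        X[t + 1] = A @ X[t] + noise_std * rng.standard_normal(d)
    return X


def least_squares(X):
    """Ordinary least-squares estimate of A from a single trajectory X."""
    past, future = X[:-1], X[1:]
    # Solve past @ M ~= future in the least-squares sense; M estimates A^T.
    M, *_ = np.linalg.lstsq(past, future, rcond=None)
    return M.T


rng = np.random.default_rng(0)

# Hypothetical stable system (spectral radius 0.5), chosen for illustration.
A_true = np.array([[0.5, 0.2],
                   [0.0, 0.3]])

# Average operator-norm error over a few independent trajectories per length.
errors = {}
for T in (100, 6400):
    trials = [
        np.linalg.norm(least_squares(simulate(A_true, T, 1.0, rng)) - A_true, 2)
        for _ in range(10)
    ]
    errors[T] = float(np.mean(trials))

print(errors)
```

Comparing `errors[100]` with `errors[6400]` gives an empirical feel for the dependence on $T$ only; the simulation says nothing about the constants, the burn-in time, or the confidence level $\delta$ that the finite-sample bounds make explicit.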

Paper Structure

This paper contains 69 sections, 43 theorems, 275 equations.

Key Result

Lemma 2.1

Let $X$ be a nonnegative random variable. For every $s > 0$ we have that

$$\mathbb{P}(X \geq s) \leq \frac{\mathbb{E}[X]}{s}.$$
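
The hypotheses of Lemma 2.1 match the standard Markov inequality, $\mathbb{P}(X \geq s) \leq \mathbb{E}[X]/s$. Reading the lemma that way is an assumption made here. Under that reading, the bound can be checked numerically. The sketch below is illustrative only: the exponential distribution, the sample size, and the helper name `tail_and_bound` are hypothetical choices, not taken from the paper.

```python
import numpy as np


def tail_and_bound(samples, s):
    """Return the empirical tail P(X >= s) and the Markov bound E[X] / s."""
    empirical_tail = float(np.mean(samples >= s))
    markov_bound = float(np.mean(samples) / s)
    return empirical_tail, markov_bound


rng = np.random.default_rng(0)

# Nonnegative samples; the exponential distribution is an arbitrary choice.
samples = rng.exponential(scale=1.0, size=100_000)

results = {s: tail_and_bound(samples, s) for s in (0.5, 1.0, 2.0, 5.0)}
for s, (tail, bound) in results.items():
    print(f"s={s}: empirical tail {tail:.4f} <= Markov bound {bound:.4f}")
```

The bound is typically loose, since it uses only the mean of $X$. Applying it to an exponential transform such as $e^{\lambda X}$ is the usual route to sharper Chernoff-type tail bounds.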

Theorems & Definitions (72)

  • Lemma 2.1
  • proof
  • Corollary 2.1: Chernoff
  • proof
  • Definition 2.1
  • Theorem 2.1: hanson1971bound, rudelson2013hanson
  • Definition 2.2
  • Lemma 2.2: Lemma 4.4.1 and Exercise 4.4.3 in vershynin_2018
  • Lemma 2.3: Corollary 4.2.13 in vershynin_2018
  • Lemma 2.4
  • ...and 62 more