Statistical Efficiency of Single- and Multi-step Models for Forecasting and Control

Anne Somalwar, Bruce D. Lee, George J. Pappas, Nikolai Matni

Abstract

Compounding error, where small prediction mistakes accumulate over time, presents a major challenge in learning-based control. A common remedy is to train multi-step predictors directly instead of rolling out single-step models. However, it is unclear when the benefits of multi-step predictors outweigh the difficulty of learning a more complex model. We provide the first quantitative analysis of this trade-off for linear dynamical systems. We study three predictor classes: (i) single-step models, (ii) multi-step models, and (iii) single-step models trained with multi-step losses. We show that when the model class is well-specified and accurately captures the system dynamics, single-step models achieve the lowest asymptotic prediction error. On the other hand, when the model class is misspecified due to partial observability, direct multi-step predictors can significantly reduce bias and improve accuracy. We provide theoretical and empirical evidence that these trade-offs persist when predictors are used in closed-loop control.
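To make the first two predictor classes concrete, the following is a minimal sketch rather than the paper's experimental setup. It assumes a fully observed scalar system $x_{t+1} = a x_t + w_t$ with standard Gaussian noise, a single trajectory, and ordinary least squares. The values of `a`, `H`, `N` and the helper `least_squares_slope` are hypothetical choices made for illustration. The third class (single-step models trained with a multi-step loss) is not shown.

```python
import random

random.seed(0)           # fixed seed so the sketch is reproducible
a, H, N = 0.9, 5, 20000  # hypothetical values chosen for illustration

# Simulate one trajectory of x_{t+1} = a * x_t + w_t with standard Gaussian noise.
x = [0.0]
for _ in range(N + H):
    x.append(a * x[-1] + random.gauss(0.0, 1.0))


def least_squares_slope(inputs, targets):
    """Scalar least squares: argmin over g of sum (target - g * input)^2."""
    num = sum(u * v for u, v in zip(inputs, targets))
    den = sum(u * u for u in inputs)
    return num / den


past = x[:N]
a_hat = least_squares_slope(past, x[1:N + 1])     # single-step model
g_rollout = a_hat ** H                            # H-step predictor via rollout
g_direct = least_squares_slope(past, x[H:N + H])  # direct H-step predictor

print(f"true a^H = {a ** H:.4f}")
print(f"rollout  = {g_rollout:.4f}")
print(f"direct   = {g_direct:.4f}")
```

In this well-specified scalar setting both estimators target the same quantity $a^H$; they differ only in how the estimate is formed from data.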

Paper Structure

This paper contains 28 sections, 15 theorems, 140 equations, 6 figures.

Key Result

Proposition II.1

The reducible asymptotic error of the multi-step predictor $\hat{G}^{MS}_N$ is given by an expression, not reproduced here, in terms of $M_{MS} \in \mathbb{R}^{H \times H}$, the matrix whose $(i,j)$ entry is $M_{MS}^{ij} = \operatorname{tr}(A^{|i-j|})$.
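As an illustration of the definition of $M_{MS}$ only, the sketch below builds the $H \times H$ matrix with entries $\operatorname{tr}(A^{|i-j|})$ for a small example. The function name `build_m_ms`, the helper functions, and the example matrix `A` are assumptions made for illustration and do not come from the paper. The sketch does not compute the reducible error itself.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))


def build_m_ms(A, H):
    """Return the H x H matrix whose (i, j) entry is tr(A^{|i-j|})."""
    n = len(A)
    # Start from A^0, the identity matrix.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    traces = []
    for _ in range(H):
        traces.append(trace(power))  # stores tr(A^k) for k = 0, ..., H-1
        power = matmul(power, A)
    # The entry depends only on |i - j|, so the matrix is symmetric Toeplitz.
    return [[traces[abs(i - j)] for j in range(H)] for i in range(H)]


# Hypothetical 2 x 2 example: a diagonal matrix with entries 0.5 and 0.25.
A = [[0.5, 0.0], [0.0, 0.25]]
M = build_m_ms(A, 3)
for row in M:
    print(row)
```

Because each entry depends only on $|i-j|$, the $H$ traces $\operatorname{tr}(A^0), \dots, \operatorname{tr}(A^{H-1})$ determine the whole matrix, and the diagonal entries equal the state dimension $\operatorname{tr}(I)$.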

Figures (6)

  • Figure 1: Convergence of $N\,\mathbf{E}[L(\hat{f}_H)]$ to the reducible prediction errors given in \ref{prop: well specified multistep} (multi-step predictor), \ref{prop: single-step well specified} (single-step rollout), and \ref{prop: ss w/ ms loss well specified} (intermediate predictor) for the system defined by \ref{eq: numerical ex system well specified} with $a = 0.5, 0.75, 0.9$ (left to right) and horizon $H=5$.
  • Figure 2: Convergence of $\mathbf{E}[L(\hat{f}_H)]$ to the irreducible prediction errors given in \ref{prop: multi-step misspecified} (multi-step predictor), \ref{prop: single-step misspecified} (single-step rollout), and \ref{prop: single-step w/ multi-step loss misspecified} (intermediate predictor) for the system defined by \ref{eq: numerical ex system misspecified} with $a = 0.5, 0.75, 0.9$ (left to right) and horizon $H=5$.
  • Figure 3: Comparison of the bias of the single-step and multi-step estimators across different horizons for the system defined in Example \ref{example}.
  • Figure 4: Infinite-horizon LQR performance in the well-specified case.
  • Figure 5: Infinite-horizon LQR performance in the misspecified case. A closed-loop spectral radius greater than 1 implies infinite LQR cost.
  • ...and 1 more figure

Theorems & Definitions (30)

  • Proposition II.1: Proposition III.1 of CDCPaper
  • proof
  • Proposition II.2: Proposition III.2 of CDCPaper
  • proof
  • Proposition II.3
  • proof
  • Proposition II.4
  • proof
  • Proposition III.1: Proposition IV.1 of CDCPaper
  • proof
  • ...and 20 more