Sample Complexity of the Sign-Perturbed Sums Identification Method: Scalar Case

Szabolcs Szentpéteri, Balázs Csanád Csáji

TL;DR

The paper studies the non-asymptotic sample complexity of the Sign-Perturbed Sums (SPS) method, a finite-sample, distribution-free identification technique, in the scalar linear regression setting. It derives high-probability upper bounds showing that SPS confidence intervals shrink at a geometric rate around the true parameter when the noises are subgaussian, and it extends these results to the SPS outer approximation. The authors treat three scenarios (identifying a constant in noise, bounded regressors, and unbounded regressors), give concentration bounds for each, and report simulations comparing the theoretical bounds with the empirical rates. The work is a stepping stone toward understanding SPS in multidimensional settings and has implications for related areas such as bandits and prediction intervals in signal processing.

Abstract

Sign-Perturbed Sums (SPS) is a powerful finite-sample system identification algorithm which can construct confidence regions for the true data-generating system with exact coverage probabilities, for any finite sample size. SPS was developed in a series of papers and it has a wide range of applications, from general linear systems, even in a closed-loop setup, to nonlinear and nonparametric approaches. Although several theoretical properties of SPS have been proven in the literature, the sample complexity of the method has not been analysed so far. This paper aims to fill this gap and provides the first results on the sample complexity of SPS. Here, we focus on scalar linear regression problems, that is, we study the behaviour of SPS confidence intervals. We provide high-probability upper bounds, under three different sets of assumptions, showing that the sizes of SPS confidence intervals shrink at a geometric rate around the true parameter if the observation noises are subgaussian. We also show that similar bounds hold for the previously proposed outer approximation of the confidence region. Finally, we present simulation experiments comparing the theoretical and the empirical convergence rates.
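
To make the object of study concrete, here is a minimal sketch of how an SPS membership test might look in the scalar case $y_t = \varphi_t \theta^* + n_t$. It is an illustration under stated assumptions, not the authors' implementation: the function name `sps_indicator`, the variable names, and the data-generating values are hypothetical. Two simplifications are made: the usual normalisation of the sums is dropped, since in the scalar case it is a common positive factor that does not change the ranking, and random tie-breaking is omitted. A candidate $\theta$ is accepted when the reference sum is not among the $q$ largest of the $m$ sums, which gives confidence level $1 - q/m$ (so $m=2$, $q=1$ corresponds to $50\,\%$).

```python
import numpy as np

def sps_indicator(theta, phi, y, signs, q=1):
    """Return True if theta is in the scalar SPS confidence region.

    signs has shape (m - 1, n) with entries +1/-1, drawn once and reused
    for every theta. The region has confidence level 1 - q/m.
    """
    m = signs.shape[0] + 1
    terms = phi * (y - phi * theta)        # phi_t * eps_t(theta)
    ref = abs(terms.sum())                 # reference sum, |S_0(theta)|
    perturbed = np.abs(signs @ terms)      # |S_i(theta)|, i = 1..m-1
    # Rank of the reference sum among all m values (1 = smallest).
    # Simplification: ties are ignored; they have probability zero for
    # continuous noise, whereas SPS proper breaks them at random.
    rank = 1 + int(np.sum(perturbed < ref))
    return rank <= m - q

# Illustrative data (assumed values, not the paper's experiments).
rng = np.random.default_rng(0)
n, m, theta_star = 400, 2, 1.0
phi = rng.normal(size=n)                          # Gaussian regressors
y = theta_star * phi + rng.normal(size=n)         # Gaussian (subgaussian) noise
signs = rng.choice([-1.0, 1.0], size=(m - 1, n))  # random sign sequences

# Approximate the confidence interval by scanning candidate parameters.
grid = np.linspace(0.5, 1.5, 2001)
mask = np.array([sps_indicator(t, phi, y, signs) for t in grid])
inside = grid[mask]
print(f"approx. interval: [{inside.min():.3f}, {inside.max():.3f}]")
```

The least-squares estimate always passes this test, because it makes the reference sum zero; the sample-complexity question is how quickly the set of accepted values contracts around the true parameter as $n$ grows.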

Paper Structure

This paper contains 12 sections, 58 equations, 3 figures, and 2 tables.

Figures (3)

  • Figure 1: SPS confidence intervals, $m=2$ ($50\,\%$ confidence).
  • Figure 2: Comparison of the empirical size and the theoretical upper bound on the size in the constant identification case for $m=2, \delta=0.1, n=400, k=1000$.
  • Figure 3: Comparison of the empirical size and the corresponding theoretical bound for scalar linear regression with Gaussian regressors, $m=2, n=400, k=1000$.