An Online Bootstrap for Time Series

Nicolai Palm; Thomas Nagler

TL;DR

This paper develops an online bootstrap that preserves dependence in streaming data by using autoregressive resampling weights. The method forms $X_i^*=\frac{V_i}{\overline{V}_n}X_i$, where the weights $V_i$ follow the autoregressive rule $V_i=1+\rho_i(V_{i-1}-1)+\sqrt{1-\rho_i^2}\,\zeta_i$ with $\rho_i=1-i^{-\beta}$. Because each weight depends only on its predecessor, updates are cheap and can be computed online, and the resulting uncertainty quantification is consistent. The authors establish asymptotic validity under stationarity and $\alpha$-mixing, identify the optimal exponent $\beta_{opt}=\sqrt{2}-1$, and show a convergence rate of $\mathcal{O}(n^{-\beta/(1+\beta)})$ for variance estimation. In simulations on iid, MA, nonlinear, and GARCH-type processes, the AR-bootstrap achieves reliable coverage and computation time competitive with block-based methods. The method also extends to transformed statistics and multivariate settings via the delta method. The framework offers a practical tool for online uncertainty quantification in ML tasks such as empirical risk minimization and bandit algorithms, bridging classical resampling and the needs of modern streaming data.
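Two short calculations make the formulas above more concrete. Both are sketches under stated assumptions rather than results quoted from the paper: the first assumes $\overline{V}_n$ denotes the average of the weights $V_1,\ldots,V_n$, and the second assumes the stated rate may be evaluated at $\beta=\beta_{opt}$.

$$
\frac{1}{n}\sum_{i=1}^{n} X_i^* \;=\; \frac{1}{n}\sum_{i=1}^{n}\frac{V_i}{\overline{V}_n}X_i \;=\; \frac{\sum_{i=1}^{n} V_i X_i}{\sum_{i=1}^{n} V_i},
$$

so the bootstrap sample average depends on the data only through the two running sums $\sum_i V_i X_i$ and $\sum_i V_i$, each of which can be updated in constant time per new observation. Note also that $\rho_1 = 1-1^{-\beta}=0$, so the recursion starts from $V_1 = 1+\zeta_1$ and needs no initial weight.

$$
\frac{\beta_{opt}}{1+\beta_{opt}} \;=\; \frac{\sqrt{2}-1}{\sqrt{2}} \;=\; 1-\frac{1}{\sqrt{2}} \;\approx\; 0.293,
$$

which, under the assumption above, would correspond to a rate of roughly $\mathcal{O}(n^{-0.293})$.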

Abstract

Resampling methods such as the bootstrap have proven invaluable in the field of machine learning. However, the applicability of traditional bootstrap methods is limited when dealing with large streams of dependent data, such as time series or spatially correlated observations. In this paper, we propose a novel bootstrap method that is designed to account for data dependencies and can be executed online, making it particularly suitable for real-time applications. This method is based on an autoregressive sequence of increasingly dependent resampling weights. We prove the theoretical validity of the proposed bootstrap scheme under general conditions. We demonstrate the effectiveness of our approach through extensive simulations and show that it provides reliable uncertainty quantification even in the presence of complex data dependencies. Our work bridges the gap between classical resampling techniques and the demands of modern data analysis, providing a valuable tool for researchers and practitioners in dynamic, data-rich environments.
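The online scheme can be illustrated with a minimal Python sketch for the sample mean, using the weight recursion given in the TL;DR. It is not the authors' implementation: the class name `OnlineARBootstrap`, its method names, the choice of standard normal innovations $\zeta_i$, and the basic-bootstrap interval are all assumptions made for illustration. Each bootstrap replicate keeps only its current weight and two running sums, so memory and per-update cost do not grow with the length of the stream.

```python
import numpy as np


class OnlineARBootstrap:
    """Hypothetical sketch: online bootstrap of the sample mean with AR weights."""

    def __init__(self, n_boot=200, beta=np.sqrt(2) - 1, seed=0):
        self.beta = beta
        self.rng = np.random.default_rng(seed)
        self.i = 0                       # number of observations seen so far
        self.v = np.ones(n_boot)         # current weight V_i of each replicate
        self.sum_v = np.zeros(n_boot)    # running sum of V_i per replicate
        self.sum_vx = np.zeros(n_boot)   # running sum of V_i * X_i per replicate
        self.sum_x = 0.0                 # running sum of X_i

    def update(self, x):
        """Process one new observation in O(n_boot) time."""
        self.i += 1
        rho = 1.0 - self.i ** (-self.beta)             # rho_i = 1 - i^(-beta)
        zeta = self.rng.standard_normal(self.v.shape)  # assumption: N(0, 1) innovations
        self.v = 1.0 + rho * (self.v - 1.0) + np.sqrt(1.0 - rho**2) * zeta
        self.sum_v += self.v
        self.sum_vx += self.v * x
        self.sum_x += x

    def mean(self):
        return self.sum_x / self.i

    def bootstrap_means(self):
        # Average of X_i^* = V_i / V_bar_n * X_i equals sum(V_i X_i) / sum(V_i).
        return self.sum_vx / self.sum_v

    def interval(self, level=0.90):
        # Assumption: a basic bootstrap interval built from the quantiles of
        # (bootstrap mean - sample mean); other interval types are possible.
        alpha = 1.0 - level
        centered = self.bootstrap_means() - self.mean()
        q_lo, q_hi = np.quantile(centered, [alpha / 2, 1 - alpha / 2])
        return self.mean() - q_hi, self.mean() - q_lo


if __name__ == "__main__":
    # Usage: stream an MA(1) series through the bootstrap one value at a time.
    rng = np.random.default_rng(42)
    eps = rng.standard_normal(2001)
    stream = eps[1:] + 0.5 * eps[:-1]
    boot = OnlineARBootstrap()
    for x in stream:
        boot.update(x)
    print(boot.mean(), boot.interval(0.90))
```

Because the weights are Gaussian in this sketch, the sum of weights in a replicate can be close to zero for very short streams, so the bootstrap means are only meaningful after a reasonable number of observations.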

Paper Structure (33 sections, 11 theorems, 124 equations, 4 figures, 1 algorithm)

Introduction
Background and related work
Online learning
Bootstrapping
Time series bootstrap
New bootstrap procedure
Proposed method
Theory
Beyond the simple sample average
Transformed random variables.
Multidimensional vectors.
Transformations of the sample average.
Numerical validation
Experimental design
Data generating processes.
...and 18 more sections

Key Result

Lemma 3.4

If the time series $X_1, X_2, \ldots \in \mathbb{R}$ satisfies assumptions A1--A3, it holds

Figures (4)

Figure 1: Estimated coverage probability of the bootstrap procedures. The target level of $90\%$ is shown as solid line.
Figure 2: Average plus/minus standard deviation of the estimated variances. The target level $\sigma_\infty$ is shown as solid line.
Figure 3: Estimated coverage probability of the bootstrap procedures with target level of $90\%$ shown as solid line (top) and average plus/minus standard deviation of the estimated variances with target level $\sigma_\infty$ shown as solid line (bottom).
Figure 4: Computation time per online update of 200 bootstrap samples as the algorithms progress through a stream of samples.

Theorems & Definitions (25)

Example 2.1: Empirical bootstrap
Example 2.2: Multiplier bootstrap
Definition 2.3: Bootstrap consistency
Definition 3.2: Stationarity
Definition 3.3: $\alpha$-mixing
Lemma 3.4
Theorem 3.5
Theorem 3.6
Corollary 3.7
Corollary 3.8
...and 15 more
