An Online Bootstrap for Time Series

Nicolai Palm, Thomas Nagler

TL;DR

The paper develops an online bootstrap that preserves dependence in streaming data by using autoregressive resampling weights. The method forms $X_i^*=\frac{V_i}{\overline{V}_n}X_i$, where the weights $V_i$ follow the autoregressive rule $V_i=1+\rho_i(V_{i-1}-1)+\sqrt{1-\rho_i^2}\zeta_i$ with $\rho_i=1-i^{-\beta}$. Because each weight depends only on its predecessor, updates are cheap and can be computed online. The authors establish asymptotic validity under stationarity and $\alpha$-mixing, identify the optimal $\beta_{opt}=\sqrt{2}-1$, and show a convergence rate of $\mathcal{O}(n^{-\beta/(1+\beta)})$ for variance estimation. In simulations on iid, MA, nonlinear, and GARCH-type processes, the AR-bootstrap achieves reliable coverage and competitive computation time compared with block-based methods. It also extends to transformed statistics and multivariate settings via the delta method. The framework offers a practical tool for online uncertainty quantification in ML tasks such as empirical risk minimization and bandit algorithms, bridging classical resampling with modern streaming data needs.
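The weight recursion above can be sketched in a few lines of Python for the simplest case, the sample mean. This is a minimal illustration, not the authors' implementation: the class name `OnlineARBootstrap`, the standard normal choice for $\zeta_i$, and the percentile interval in the usage lines are all assumptions made here. Note that $\rho_1=1-1^{-\beta}=0$, so the first weight is $1+\zeta_1$ and the initial value of $V_0$ does not matter.

```python
import numpy as np


class OnlineARBootstrap:
    """Illustrative online bootstrap of the sample mean with AR weights.

    Hypothetical helper class; the innovations zeta_i are ASSUMED to be
    iid standard normal, which the summary above does not specify.
    """

    def __init__(self, n_boot=200, beta=np.sqrt(2) - 1, seed=0):
        self.beta = beta
        self.rng = np.random.default_rng(seed)
        self.n = 0                       # observations seen so far
        self.v = np.ones(n_boot)         # current weight V_i per replicate
        self.sum_v = np.zeros(n_boot)    # running sum of V_i
        self.sum_vx = np.zeros(n_boot)   # running sum of V_i * X_i
        self.sum_x = 0.0                 # running sum of X_i

    def update(self, x):
        """Consume one observation in O(n_boot) time and memory."""
        self.n += 1
        rho = 1.0 - self.n ** (-self.beta)            # rho_i = 1 - i^(-beta)
        zeta = self.rng.standard_normal(self.v.shape)  # assumed Gaussian
        # V_i = 1 + rho_i (V_{i-1} - 1) + sqrt(1 - rho_i^2) zeta_i
        self.v = 1.0 + rho * (self.v - 1.0) + np.sqrt(1.0 - rho**2) * zeta
        self.sum_v += self.v
        self.sum_vx += self.v * x
        self.sum_x += x

    def mean(self):
        """Plain sample mean of the stream so far."""
        return self.sum_x / self.n

    def bootstrap_means(self):
        """Mean of X_i^* = (V_i / V_bar_n) X_i, one value per replicate."""
        return self.sum_vx / self.sum_v


# Usage on a toy MA(1) stream (synthetic data, for illustration only).
rng = np.random.default_rng(0)
eps = rng.standard_normal(2001)
stream = eps[1:] + 0.5 * eps[:-1]

boot = OnlineARBootstrap(n_boot=200, seed=1)
for x in stream:
    boot.update(x)

# Percentile interval at the 90% level: an assumed, simple choice.
lo, hi = np.quantile(boot.bootstrap_means(), [0.05, 0.95])
print(boot.mean(), lo, hi)
```

Only three running sums per replicate are stored, so the cost of each update does not grow with the length of the stream. For very short streams the sum of Gaussian weights can be close to zero, which makes the ratio unstable; the sketch does not guard against that. With the default $\beta=\sqrt{2}-1$, the exponent in the stated rate is $\beta/(1+\beta)=1-1/\sqrt{2}\approx 0.29$.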

Abstract

Resampling methods such as the bootstrap have proven invaluable in the field of machine learning. However, the applicability of traditional bootstrap methods is limited when dealing with large streams of dependent data, such as time series or spatially correlated observations. In this paper, we propose a novel bootstrap method that is designed to account for data dependencies and can be executed online, making it particularly suitable for real-time applications. This method is based on an autoregressive sequence of increasingly dependent resampling weights. We prove the theoretical validity of the proposed bootstrap scheme under general conditions. We demonstrate the effectiveness of our approach through extensive simulations and show that it provides reliable uncertainty quantification even in the presence of complex data dependencies. Our work bridges the gap between classical resampling techniques and the demands of modern data analysis, providing a valuable tool for researchers and practitioners in dynamic, data-rich environments.

Paper Structure

This paper contains 33 sections, 11 theorems, 124 equations, 4 figures, and 1 algorithm.

Key Result

Lemma 3.4

If the time series $X_1, X_2, \ldots \in \mathbb{R}$ satisfies assumptions A1--A3, it holds

Figures (4)

  • Figure 1: Estimated coverage probability of the bootstrap procedures. The target level of $90\%$ is shown as a solid line.
  • Figure 2: Average plus/minus one standard deviation of the estimated variances. The target level $\sigma_\infty$ is shown as a solid line.
  • Figure 3: Top: estimated coverage probability of the bootstrap procedures, with the target level of $90\%$ shown as a solid line. Bottom: average plus/minus one standard deviation of the estimated variances, with the target level $\sigma_\infty$ shown as a solid line.
  • Figure 4: Computation time per online update of 200 bootstrap samples as the algorithms progress through a stream of samples.

Theorems & Definitions (25)

  • Example 2.1: Empirical bootstrap
  • Example 2.2: Multiplier bootstrap
  • Definition 2.3: Bootstrap consistency
  • Definition 3.2: Stationarity
  • Definition 3.3: $\alpha$-mixing
  • Lemma 3.4
  • Theorem 3.5
  • Theorem 3.6
  • Corollary 3.7
  • Corollary 3.8
  • ...and 15 more