Orthogonal Bootstrap: Efficient Simulation of Input Uncertainty

Kaizhao Liu; Jose Blanchet; Lexing Ying; Yiping Lu

TL;DR

This paper introduces Orthogonal Bootstrap, a debiasing strategy for bootstrap-based uncertainty quantification under input uncertainty. By decomposing the bootstrap target into a non-orthogonal part captured by the influence function (via Infinitesimal Jackknife) and an orthogonal remainder, the method uses the non-orthogonal component as a control variate and simulates only the orthogonal part, dramatically reducing the required Monte Carlo replications. The authors prove that, under mild smoothness assumptions in a kernel mean embedding (Fréchet derivative) framework, the simulation variance drops from $O_p(1/n^{1+\nabla})$ for standard bootstrap to $O_p(1/n^{2+\nabla})$ for the orthogonal method, enabling $O(1)$ replications. They also extend the approach to variance estimation and provide extensive numerical experiments on debiasing, confidence/prediction interval construction, and real-data prediction, showing improved accuracy and competitiveness with much lower computational cost. The work offers a practical, theoretically grounded path to fast, reliable uncertainty quantification for large-scale applications in statistics and ML.
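
The decomposition described in this summary can be written out as a short worked equation. This is a sketch under assumed notation that may differ from the paper's: $\hat{P}$ is the empirical distribution of the $n$ inputs, $\hat{P}^{b}$ is the $b$-th bootstrap resample with points $X_i^{b}$, $\mathrm{IF}(\cdot;\hat{P})$ is the empirical influence function of $\phi$, $R^{b}$ is the remainder, and $\mathbb{E}_{*}$ is expectation over resampling. It relies on the standard fact that the empirical influence function averages to zero over the sample.

```latex
\begin{align*}
\phi(\hat{P}^{b})
  &= \underbrace{\hat{\phi} + \frac{1}{n}\sum_{i=1}^{n}\mathrm{IF}(X_i^{b};\hat{P})}_{\text{non-orthogonal (linear) part}}
   \;+\; \underbrace{R^{b}}_{\text{orthogonal part}}, \\
\mathbb{E}_{*}\!\left[\frac{1}{n}\sum_{i=1}^{n}\mathrm{IF}(X_i^{b};\hat{P})\right]
  &= \frac{1}{n}\sum_{i=1}^{n}\mathrm{IF}(X_i;\hat{P}) \;=\; 0, \\
\mathbb{E}_{*}\!\left[\phi(\hat{P}^{b})\right]
  &= \hat{\phi} + \mathbb{E}_{*}\!\left[R^{b}\right]
   \;\approx\; \hat{\phi} + \frac{1}{B}\sum_{b=1}^{B} R^{b}.
\end{align*}
```

Because the linear part has a known resampling mean, only $R^{b}$ needs Monte Carlo averaging, and $R^{b}$ is typically much smaller than the fluctuation of $\phi(\hat{P}^{b})$ itself.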

Abstract

Bootstrap is a popular methodology for simulating input uncertainty. However, it can be computationally expensive when the number of samples is large. We propose a new approach called Orthogonal Bootstrap that reduces the number of required Monte Carlo replications. We decompose the target being simulated into two parts: the non-orthogonal part, which has a closed-form result known as the Infinitesimal Jackknife, and the orthogonal part, which is easier to simulate. We show theoretically and numerically that Orthogonal Bootstrap significantly reduces the computational cost of Bootstrap while improving empirical accuracy and maintaining the same width of the constructed interval.
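
To make the two-part decomposition concrete, here is a minimal sketch for the simplest case, a smooth function of the sample mean, $\phi = g(\bar{x})$. It is an illustration rather than the paper's algorithm: the function names, the argument `g_prime`, and the choice of functional are all assumptions made for this example. The linear (influence-function) term $g'(\bar{x})(\bar{x}^{b}-\bar{x})$ has resampling mean zero, so it is subtracted from each replicate as a control variate.

```python
import numpy as np


def standard_bootstrap_debias(x, g, B, rng):
    """Bias-corrected estimate 2*phi_hat - mean_b(phi_b) with B resamples."""
    n = len(x)
    phi_hat = g(x.mean())
    idx = rng.integers(0, n, size=(B, n))   # B resamples drawn with replacement
    phi_b = g(x[idx].mean(axis=1))          # g must accept NumPy arrays
    return 2.0 * phi_hat - phi_b.mean()


def orthogonal_bootstrap_debias(x, g, g_prime, B, rng):
    """Same target, but the linear influence term is removed from each replicate.

    Assumption for this sketch: phi = g(mean), whose empirical influence
    function is g'(x_bar) * (x - x_bar); its bootstrap mean is exactly zero.
    """
    n = len(x)
    x_bar = x.mean()
    phi_hat = g(x_bar)
    idx = rng.integers(0, n, size=(B, n))
    x_bar_b = x[idx].mean(axis=1)
    linear_b = g_prime(x_bar) * (x_bar_b - x_bar)   # non-orthogonal part
    # Only the remainder g(x_bar_b) - linear_b carries Monte Carlo noise.
    return 2.0 * phi_hat - (g(x_bar_b) - linear_b).mean()
```

For example, with `g = np.square` and `g_prime = lambda m: 2.0 * m`, repeated calls to `orthogonal_bootstrap_debias` at a small `B` scatter far less around the exact bootstrap correction than calls to `standard_bootstrap_debias`, since the remainder is $(\bar{x}^{b}-\bar{x})^2$ rather than the full $(\bar{x}^{b})^2$.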

Paper Structure (38 sections, 26 theorems, 125 equations, 4 figures, 3 tables, 1 algorithm)

Introduction
Related Work
Contribution
Organization of the Paper
Notation
Debiasing via Orthogonal Bootstrap
Provable Improvement of Orthogonal Bootstrap
Informal Proof:
Variance Estimation via Orthogonal Bootstrap
Numerical Examples
Debiasing
Function of Mean
Entropy
Constrained Optimization Problem
Confidence Interval Construction
...and 23 more sections

Key Result

Theorem 2.1

Let $X_{ob}$ be the Orthogonal Bootstrap estimator defined in Equation (eq:debiasestimator) and $X_{sb}$ be the Standard Bootstrap estimator defined by $X_{sb}:=2\hat{\phi}-\frac{1}{B}\sum_{b=1}^B \hat{\phi}^b$. Under Assumption (rkhs embed) and Assumption (lip in rkhs), if the num
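
As an illustration of the Standard Bootstrap estimator $X_{sb}=2\hat{\phi}-\frac{1}{B}\sum_{b=1}^B \hat{\phi}^b$ quoted above, the sketch below applies it to the plug-in entropy of a discrete distribution, one of the functionals listed in the paper's outline. The function names and the representation of the data as a vector of category counts are assumptions made for this example, not the paper's code.

```python
import numpy as np


def plugin_entropy(counts):
    """Plug-in Shannon entropy (in nats) of the empirical distribution."""
    p = counts / counts.sum()
    p = p[p > 0]                      # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum())


def standard_bootstrap_entropy(counts, B, rng):
    """X_sb = 2 * phi_hat - (1/B) * sum_b phi_hat^b for the entropy functional."""
    n = int(counts.sum())
    p_hat = counts / n
    phi_hat = plugin_entropy(counts)
    # Resampling n points from the empirical distribution is a multinomial draw.
    resampled = rng.multinomial(n, p_hat, size=B)   # shape (B, K)
    phi_b = np.array([plugin_entropy(c) for c in resampled])
    return 2.0 * phi_hat - phi_b.mean()
```

The plug-in entropy is biased downward, so for a reasonably large `B` the corrected value returned by `standard_bootstrap_entropy` usually sits above `plugin_entropy(counts)`.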

Figures (4)

Figure 1: We consider modeling the relationship between resampled distribution and simulation output as nuisance estimation in orthogonal statistical learning. In Orthogonal Bootstrap, we use linear modeling for the nuisance estimation and only focus on the simulation of the orthogonal part (i.e. the residual of linear modeling) to reduce the simulation error.
Figure 2: Orthogonal Bootstrap significantly reduces the simulation error for the examples shown in ma2022correcting when the number of Monte Carlo replications is limited. The $x$-axis represents the number of Monte Carlo replications and the $y$-axis denotes the bias of the estimate. The shaded area represents the 90% quantile interval over repeated simulations.
Figure 3: Orthogonal Bootstrap significantly reduces the simulation variance for the examples shown in ma2022correcting when the number of Bootstrap resamples is limited. The $x$-axis represents the number of Bootstrap resamples and the $y$-axis denotes the root mean square error of the estimate. The shaded area represents the 80% quantile interval over repeated simulations.
Figure 4: Comparison of indirect and direct influence function calculation in the constrained optimization problem.

Theorems & Definitions (44)

Theorem 2.1
Remark 2.2
Lemma 3.1
Theorem 3.2
Remark 3.3
Lemma 1.1: Section 6.3.2 Lemma A, serfling2009approximation
proof
Lemma 1.2: Section 6.3.2 Lemma B, serfling2009approximation
proof
Lemma 1.3
...and 34 more
