A Note on the Prediction-Powered Bootstrap

Tijana Zrnic

TL;DR

This note introduces PPBoot, a bootstrap-based approach to prediction-powered inference that applies to arbitrary estimation problems and needs only a single application of the bootstrap to form confidence intervals. In each bootstrap iteration $b$, PPBoot resamples the labeled data $(X^*, Y^*)$ and the unlabeled data $\tilde{X}^*$, applies the prediction model $f$, and computes $\theta_b^* = \hat{\theta}(\tilde{X}^*, f(\tilde{X}^*)) + \hat{\theta}(X^*, Y^*) - \hat{\theta}(X^*, f(X^*))$; a percentile bootstrap over these draws gives the interval. PPBoot thereby delivers valid intervals without problem-specific asymptotic variance calculations, and can be asymptotically normal when $\hat{\theta}$ is. The note extends PPBoot with power tuning (via a parameter $\lambda$) to boost power, and with cross-fitting (Cross-PPBoot) for settings where no pre-trained model is available. Empirical results on Galaxy Zoo 2, AlphaFold, gene expression, and Census tasks show that PPBoot often matches or beats PPI/PPI++ in interval width while maintaining proper coverage, and generally produces tighter intervals than classical CLT-based methods. Overall, PPBoot offers a simple, versatile framework that broadens the range of problems to which prediction-powered inference can be applied.

Abstract

We introduce PPBoot: a bootstrap-based method for prediction-powered inference. PPBoot is applicable to arbitrary estimation problems and is very simple to implement, essentially only requiring one application of the bootstrap. Through a series of examples, we demonstrate that PPBoot often performs nearly identically to (and sometimes better than) the earlier PPI(++) method based on asymptotic normality (when the latter is applicable), without requiring any asymptotic characterizations. Given its versatility, PPBoot could simplify and expand the scope of application of prediction-powered inference to problems where central limit theorems are hard to prove.
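
The construction in the TL;DR can be illustrated with a minimal sketch. It is not the author's reference implementation: the function name `ppboot_interval`, its parameters, the per-point callable `f`, and the crude order-statistic quantiles are all assumptions made for illustration. The sketch resamples the labeled pairs and the unlabeled points with replacement, forms $\theta_b^*$ from the three estimator calls, and reads the interval off the percentiles of the draws.

```python
import random
from statistics import fmean


def ppboot_interval(X, Y, X_unlabeled, f, estimator, alpha=0.1, B=1000, seed=0):
    """Percentile-bootstrap interval from PPBoot-style draws (illustrative sketch).

    estimator(features, outcomes) -> float is a hypothetical signature that
    stands in for theta-hat; f maps one feature value to one prediction.
    """
    rng = random.Random(seed)
    n, N = len(X), len(X_unlabeled)
    # Predictions are computed once and then resampled by index.
    f_lab = [f(x) for x in X]
    f_unl = [f(x) for x in X_unlabeled]

    draws = []
    for _ in range(B):
        i = [rng.randrange(n) for _ in range(n)]  # labeled pairs, drawn jointly
        j = [rng.randrange(N) for _ in range(N)]  # unlabeled points
        X_star = [X[k] for k in i]
        Xt_star = [X_unlabeled[k] for k in j]
        theta_b = (
            estimator(Xt_star, [f_unl[k] for k in j])   # theta-hat on unlabeled predictions
            + estimator(X_star, [Y[k] for k in i])      # theta-hat on labeled outcomes
            - estimator(X_star, [f_lab[k] for k in i])  # theta-hat on labeled predictions
        )
        draws.append(theta_b)

    # Crude empirical quantiles via order statistics (assumption: no interpolation).
    draws.sort()
    lo = draws[int((alpha / 2) * B)]
    hi = draws[min(B - 1, int((1 - alpha / 2) * B))]
    return lo, hi


def mean_estimator(features, outcomes):
    """Example theta-hat: the sample mean of the outcomes (features unused)."""
    return fmean(outcomes)
```

For mean estimation, a call would look like `ppboot_interval(X, Y, X_unlabeled, f=model, estimator=mean_estimator)`, where `model` is whatever predictor is available. Other targets (a median, a regression coefficient) would only swap the `estimator` argument, which is the versatility the abstract emphasizes.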

Paper Structure (11 sections, 7 equations, 11 figures, 1 algorithm)

Introduction
Problem setup.
PPBoot
Applications
Galaxies.
AlphaFold.
Gene expression.
Census.
Extensions
Power-tuned PPBoot
Cross-PPBoot

Figures (11)

Figure 1: Classical inference, PPI, and PPBoot, applied to estimating the fraction of spiral galaxies from galaxy images.
Figure 2: Classical inference, PPI, and PPBoot, applied to estimating the odds ratio between protein phosphorylation and protein disorder with AlphaFold predictions.
Figure 3: Classical inference, PPI, and PPBoot, applied to estimating the median gene expression with transformer predictions.
Figure 4: Classical inference, PPI, and PPBoot, applied to estimating the relationship between age and income in US census data.
Figure 5: Classical inference, PPI, and PPBoot, applied to estimating the relationship between income and having health insurance in US census data.
...and 6 more figures
