Generative modeling for the bootstrap

Leon Tran; Ting Ye; Peng Ding; Fang Han

TL;DR

The generative modeling-based bootstrap learns a generator $\widehat{\bm{G}}_n$ that maps noise $\bm{U}$ to synthetic data approximating the unknown distribution $\mathrm{P}_Z$, unifying the classical, parametric, and smoothed bootstraps. The authors prove bootstrap consistency for regular M-estimators and for irregular estimators such as isotonic regression under broad assumptions on the data, noise, and generator: the conditional bootstrap distributions converge to the same limits as the original estimators. GANs (e.g., W-GAN) and flow-based models (affine autoregressive flows) are shown to satisfy the framework’s assumptions, with the flow bootstrap offering stronger guarantees for irregular problems. Simulations compare the original, smoothed, GAN, and flow bootstraps on OLS and isotonic regression. The GAN and flow bootstraps can match or exceed the original bootstrap’s performance and are more robust to high dimensionality than kernel-based smoothing.

Abstract

Generative modeling builds on and substantially advances the classical idea of simulating synthetic data from observed samples. This paper shows that this principle is not only natural but also theoretically well-founded for bootstrap inference: it yields statistically valid confidence intervals that apply simultaneously to both regular and irregular estimators, including settings in which Efron's bootstrap fails. In this sense, the generative modeling-based bootstrap can be viewed as a modern version of the smoothed bootstrap: it could mitigate the curse of dimensionality and remain effective in challenging regimes where estimators may lack root-$n$ consistency or a Gaussian limit.
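The resampling loop described above (fit a generator, push fresh noise through it, recompute the estimator on each synthetic sample) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the Gaussian affine generator stands in for a trained W-GAN or flow, and the function names, the percentile interval, and the simulated data are all assumptions made for this example.

```python
import numpy as np

def fit_gaussian_generator(Z):
    """Stand-in generator (assumption): G(u) = mean + L u, where L is the
    Cholesky factor of the sample covariance. A W-GAN or flow would go here."""
    mean = Z.mean(axis=0)
    chol = np.linalg.cholesky(np.cov(Z, rowvar=False))
    return lambda U: mean + U @ chol.T

def ols_coefficients(Z):
    """OLS of the last column on the remaining columns, with an intercept."""
    X = np.column_stack([np.ones(len(Z)), Z[:, :-1]])
    beta, *_ = np.linalg.lstsq(X, Z[:, -1], rcond=None)
    return beta

def generative_bootstrap(Z, estimator, fit_generator, n_boot=500, seed=0):
    """Return n_boot estimator replicates computed on generator-made samples."""
    rng = np.random.default_rng(seed)
    n, d = Z.shape
    generator = fit_generator(Z)          # learned once from the observed data
    reps = np.empty((n_boot, len(estimator(Z))))
    for b in range(n_boot):
        U = rng.standard_normal((n, d))   # fresh noise for each replicate
        reps[b] = estimator(generator(U)) # estimator on a synthetic sample of size n
    return reps

# Hypothetical data: y = 1 + 2x + noise, with n = 200.
rng = np.random.default_rng(1)
x = rng.standard_normal(200)
y = 1.0 + 2.0 * x + rng.standard_normal(200)
Z = np.column_stack([x, y])

theta_hat = ols_coefficients(Z)
reps = generative_bootstrap(Z, ols_coefficients, fit_gaussian_generator)
ci_low, ci_high = np.percentile(reps, [2.5, 97.5], axis=0)  # percentile intervals
```

With a Gaussian generator this reduces to a parametric bootstrap, one of the special cases the framework is said to unify. Swapping `fit_gaussian_generator` for a function that trains a neural generator leaves the rest of the loop unchanged.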

Paper Structure (27 sections, 39 theorems, 296 equations, 2 tables)

Introduction
Generative modeling-based bootstrap
A general framework
Examples
Discussion
Theory for regular M-estimators
Theory for isotonic regression: an irregular estimator
GAN and flow bootstraps
W-GAN
Affine autoregressive flows
Simulation
Methods and implementation
Regular estimator: ordinary least squares
Isotonic regression
Proofs of main theorems
...and 12 more sections

Key Result

Theorem 3.1

Under Assumptions assump_ndata-assump_bsmest, we have

Theorems & Definitions (62)

Definition 2.1: Neural networks
Example 2.1: Wasserstein GAN-based generative models (Arjovsky et al., 2017)
Definition 2.2: Bijective monotone upper triangular functions
Definition 2.3: Affine autoregressive flows
Example 2.2: Affine autoregressive flow-based generative models
Theorem 3.1: Bootstrap consistency, regular M-estimators
Remark 3.1
Theorem 4.1: Bootstrap consistency, isotonic regression
Theorem 5.1: GAN bootstrap
Theorem 5.2: Flow bootstrap
...and 52 more
