Deep Bootstrap

Jinyuan Chang; Yuling Jiao; Lican Kang; Junjie Shi

TL;DR

The paper tackles uncertainty quantification in nonparametric regression by marrying conditional diffusion modeling with bootstrap resampling. It learns the conditional distribution $P_{\mathbf{Y}|\mathbf{X}}$ with a variance-preserving diffusion model, uses generated samples to form a regression estimator $\widehat{f}$, and then constructs bootstrap replicas to obtain $\widehat{f}^*$ and confidence intervals. The authors derive sharp end-to-end convergence rates in Wasserstein distance for the conditional diffusion model and establish bootstrap consistency and coverage guarantees, supported by rigorous proofs and numerical experiments. Empirically, the method demonstrates accurate interval coverage and scalability to higher-dimensional covariates, validating the practical utility of integrating diffusion-based generative modeling into bootstrap inference. The framework offers a principled, unified approach to both sampling from complex conditional distributions and nonparametric estimation, with potential extensions to time series and broader nonparametric settings.
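As a rough illustration of the estimator described above, the conditional-mean formulation can be written as follows. The symbols $m$ (number of generated responses), $\widehat{\mathbf{Y}}^{(j)}$ (a generated response), and $\widehat{P}_{\mathbf{Y}|\mathbf{X}}$ (the learned conditional distribution) are notation assumed here for the sketch and may differ from the paper's own.

$$
f(\mathbf{x}) = \mathbb{E}[\mathbf{Y} \mid \mathbf{X} = \mathbf{x}],
\qquad
\widehat{f}(\mathbf{x}) = \frac{1}{m}\sum_{j=1}^{m} \widehat{\mathbf{Y}}^{(j)}(\mathbf{x}),
\qquad
\widehat{\mathbf{Y}}^{(j)}(\mathbf{x}) \sim \widehat{P}_{\mathbf{Y}|\mathbf{X}=\mathbf{x}} .
$$

The bootstrap counterpart $\widehat{f}^*$ would be the same sample mean computed from a model retrained on the bootstrap data.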

Abstract

In this work, we propose a novel deep bootstrap framework for nonparametric regression based on conditional diffusion models. Specifically, we construct a conditional diffusion model to learn the distribution of the response variable given the covariates. This model is then used to generate bootstrap samples by pairing the original covariates with newly synthesized responses. We reformulate nonparametric regression as conditional sample mean estimation, which is implemented directly via the learned conditional diffusion model. Unlike traditional bootstrap methods that decouple the estimation of the conditional distribution, sampling, and nonparametric regression, our approach integrates these components into a unified generative framework. With the expressive capacity of diffusion models, our method facilitates both efficient sampling from high-dimensional or multimodal distributions and accurate nonparametric estimation. We establish rigorous theoretical guarantees for the proposed method. In particular, we derive optimal end-to-end convergence rates in the Wasserstein distance between the learned and target conditional distributions. Building on this foundation, we further establish the convergence guarantees of the resulting bootstrap procedure. Numerical studies demonstrate the effectiveness and scalability of our approach for complex regression tasks.
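The procedure outlined in the abstract can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not the authors' implementation: `fit_sampler` is a hypothetical stand-in that fits a linear-Gaussian model where the paper trains a conditional diffusion model, the covariate is one-dimensional, and the percentile interval at the end is one common choice that the abstract does not confirm.

```python
import numpy as np


def fit_sampler(x, y):
    """Stand-in for training a conditional generative model of Y | X.

    ASSUMPTION: a linear-Gaussian fit replaces the conditional diffusion model.
    """
    slope, intercept = np.polyfit(x, y, deg=1)  # highest degree first
    sigma = np.std(y - (intercept + slope * x))

    def sample(x_query, n_draws, rng):
        # Returns an array of shape (n_draws, len(x_query)).
        x_query = np.atleast_1d(x_query)
        mean = intercept + slope * x_query
        return mean + sigma * rng.standard_normal((n_draws, x_query.size))

    return sample


def conditional_mean(sample, x_query, n_draws, rng):
    """Regression estimate as the sample mean of generated responses."""
    return sample(x_query, n_draws, rng).mean(axis=0)


def bootstrap_interval(x, y, x0, n_boot=200, n_draws=500, level=0.95, seed=0):
    rng = np.random.default_rng(seed)
    sampler = fit_sampler(x, y)
    f_hat = conditional_mean(sampler, x0, n_draws, rng)[0]

    f_star = np.empty(n_boot)
    for b in range(n_boot):
        # Pair each original covariate with one newly synthesized response.
        y_star = sampler(x, 1, rng)[0]
        # Refit on the bootstrap sample and recompute the estimator at x0.
        sampler_star = fit_sampler(x, y_star)
        f_star[b] = conditional_mean(sampler_star, x0, n_draws, rng)[0]

    # ASSUMPTION: percentile interval; the paper's construction may differ.
    alpha = 1.0 - level
    lo, hi = np.quantile(f_star, [alpha / 2, 1 - alpha / 2])
    return f_hat, lo, hi
```

Calling `bootstrap_interval(x, y, x0=0.5)` on arrays `x` and `y` returns the point estimate together with the lower and upper interval endpoints. Swapping `fit_sampler` for a trained conditional diffusion sampler would leave the rest of the loop unchanged, which is the sense in which the approach unifies distribution learning, sampling, and regression.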

Paper Structure (38 sections, 30 theorems, 427 equations, 4 tables, 2 algorithms)

Introduction
Contributions
Related Work
Preliminary
Outlines
Method
Conditional Diffusion Model
Bootstrap via Conditional Diffusion Model
Theory
Convergence of Conditional Diffusion Model
Bound $\mathbb{E}_{\mathcal{D},\mathcal{T},\mathcal{Z}}\mathbb{E}_{\mathbf{x}}[\mathcal{W}_2(\widetilde{p}_T^B(\cdot|\mathbf{x}),p_T^B(\cdot|\mathbf{x}))]$
Bound $\mathbb{E}_{\mathbf{x}}[\mathcal{W}_2(p_T^B(\cdot|\mathbf{x}), p_0(\cdot|\mathbf{x}))]$
Bound $\mathbb{E}_{\mathcal{D},\mathcal{T},\mathcal{Z}}\mathbb{E}_{\mathbf{x}}[\mathcal{W}_2(\widetilde{p}_T^B(\cdot|\mathbf{x}),p_0(\cdot|\mathbf{x}))]$
Convergence of Bootstrap
Numerical Experiments
...and 23 more sections

Key Result

Lemma 3.1

Suppose that the assumptions from bounded density through bounded derivative hold. Let $\mathcal{M} \gg 1, C_T > 0$, and $T = \mathcal{M}^{-C_T}$. Then we can choose a ReLU neural network $\mathbf{b}\in\mathrm{NN}(L,M,J,\kappa)$ that satisfies for a constant $C_0 = \mathcal{O}(\sqrt{\log\mathcal{M}})$, and has the following structure: Moreover, for any $t\in[\mathcal{M}^{-C_T}, 1 - \mathcal{M}^{-C

Theorems & Definitions (58)

Definition 1.1: ReLU DNNs
Definition 1.2: Wasserstein distance
Definition 1.3: Covering number
Definition 1.4: ($\beta$, $R$)-Hölder Class
Remark 3.1
Lemma 3.1: Approximation Error
Remark 3.2
Lemma 3.2: Statistical Error
Theorem 3.3: Error Bound for Conditional Score Estimation
Theorem 3.4
...and 48 more
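
For orientation, the convergence results listed above are stated in the Wasserstein distance. The standard definition of the 2-Wasserstein distance between probability measures $\mu$ and $\nu$ on $\mathbb{R}^d$ is given below; the notation $\Pi(\mu,\nu)$ for the set of couplings is assumed here, and the paper's Definition 1.2 may be phrased differently or for general order.

$$
\mathcal{W}_2(\mu,\nu) = \left( \inf_{\gamma \in \Pi(\mu,\nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|\mathbf{u} - \mathbf{v}\|_2^2 \, \mathrm{d}\gamma(\mathbf{u},\mathbf{v}) \right)^{1/2},
$$

where $\Pi(\mu,\nu)$ is the set of joint distributions on $\mathbb{R}^d \times \mathbb{R}^d$ with marginals $\mu$ and $\nu$.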
