Confidence Interval Construction and Conditional Variance Estimation with Dense ReLU Networks

Carlos Misael Madrid Padilla, Oscar Hernan Madrid Padilla, Yik Lun Kei, Zhi Zhang, Yanzhen Chen

TL;DR

The paper tackles conditional variance estimation and confidence interval construction in nonparametric regression using dense ReLU networks. It proposes a residual-based framework that yields non-asymptotic bounds under heteroscedastic and homoscedastic noise, relaxing sub-Gaussian assumptions to sub-Exponential. For ReLU estimators, the work derives non-asymptotic bounds for both the conditional mean $f^*(x)$ and the conditional variance $g^*(x)$, marking the first variance-estimation results for ReLU networks, and introduces a bootstrap-based confidence interval with guaranteed coverage. Empirical results on simulations and a real California housing dataset demonstrate improved variance estimation accuracy and reliable, efficient confidence intervals compared to strong baselines.
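The residual-based idea can be pictured as two regressions: first estimate the conditional mean $f^*$, then regress the squared residuals on the covariates to estimate the conditional variance $g^*$. The sketch below is an illustration under stated assumptions, not the paper's estimator. In place of a trained dense ReLU network it uses a single hidden ReLU layer with fixed random weights and a ridge-fitted output layer. The helper names `relu_features` and `fit_ridge`, the simulated data, and all hyperparameters are hypothetical choices made for this example.

```python
import numpy as np

def relu_features(x, W, b):
    """One hidden ReLU layer with fixed (random) weights: max(0, xW + b)."""
    return np.maximum(0.0, x @ W + b)

def fit_ridge(Phi, y, lam=1e-2):
    """Ridge least squares for the output layer."""
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ y)

rng = np.random.default_rng(0)
n, width = 2000, 50

# Hypothetical heteroscedastic data: y = f*(x) + sqrt(g*(x)) * eps
x = rng.uniform(-1.0, 1.0, size=(n, 1))
f_star = np.sin(3.0 * x[:, 0])
g_star = 0.1 + 0.5 * x[:, 0] ** 2
y = f_star + np.sqrt(g_star) * rng.standard_normal(n)

# Fixed random hidden layer (assumed stand-in for a trained dense ReLU network)
W = rng.normal(size=(1, width)) * 3.0
b = rng.uniform(-3.0, 3.0, size=width)
Phi = relu_features(x, W, b)

# Stage 1: estimate the conditional mean
beta_f = fit_ridge(Phi, y)
f_hat = Phi @ beta_f

# Stage 2: regress the squared residuals on x to estimate the conditional variance
resid_sq = (y - f_hat) ** 2
beta_g = fit_ridge(Phi, resid_sq)
g_hat = np.maximum(Phi @ beta_g, 0.0)  # clip at zero: a variance cannot be negative

mse_f = np.mean((f_hat - f_star) ** 2)
mse_g = np.mean((g_hat - g_star) ** 2)
print(f"mean MSE: {mse_f:.4f}, variance MSE: {mse_g:.4f}")
```

Clipping the second-stage fit at zero is a convenience of this sketch. Whether and how the paper constrains its variance estimate is not stated in this summary.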

Abstract

This paper addresses the problems of conditional variance estimation and confidence interval construction in nonparametric regression using dense networks with the Rectified Linear Unit (ReLU) activation function. We present a residual-based framework for conditional variance estimation, deriving non-asymptotic bounds for variance estimation under both heteroscedastic and homoscedastic settings. We relax the sub-Gaussian noise assumption, allowing the proposed bounds to accommodate sub-Exponential noise and beyond. Building on this, for a ReLU neural network estimator, we derive non-asymptotic bounds for both its conditional mean and variance estimation, representing the first result for variance estimation using ReLU networks. Furthermore, we develop a ReLU-network-based robust bootstrap procedure (Efron, 1992) for constructing confidence intervals for the true mean that comes with a theoretical guarantee on the coverage, providing a significant advancement in uncertainty quantification and the construction of reliable confidence intervals in deep learning settings.
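To give a rough picture of how a bootstrap interval for the conditional mean at a point can be formed, the sketch below uses a generic pairs bootstrap with percentile limits. It is not the robust procedure developed in the paper, whose details are not given in this summary, and it carries no coverage guarantee. The estimator is again an assumed stand-in (a fixed random ReLU layer with a ridge-fitted output layer), and `fit_predict`, the simulated data, the query point `x0`, and the constants `B` and `alpha` are hypothetical.

```python
import numpy as np

def fit_predict(x, y, x0, W, b, lam=1e-2):
    """Fit the ridge output layer on ReLU features of x, then predict at x0."""
    Phi = np.maximum(0.0, x @ W + b)
    beta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)
    return np.maximum(0.0, x0 @ W + b) @ beta

rng = np.random.default_rng(1)
n, width, B, alpha = 500, 30, 200, 0.05

# Hypothetical heteroscedastic data
x = rng.uniform(-1.0, 1.0, size=(n, 1))
y = np.sin(3.0 * x[:, 0]) + np.sqrt(0.1 + 0.5 * x[:, 0] ** 2) * rng.standard_normal(n)

# Fixed random hidden layer, shared across all bootstrap refits
W = rng.normal(size=(1, width)) * 3.0
b = rng.uniform(-3.0, 3.0, size=width)
x0 = np.array([[0.25]])  # query point for the interval

# Pairs bootstrap: resample (x_i, y_i) with replacement and refit each time
boot = np.empty(B)
for k in range(B):
    idx = rng.integers(0, n, size=n)
    boot[k] = fit_predict(x[idx], y[idx], x0, W, b)[0]

# Percentile interval from the bootstrap distribution of the prediction at x0
lower, upper = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
point = fit_predict(x, y, x0, W, b)[0]
print(f"estimate {point:.3f}, {100 * (1 - alpha):.0f}% interval [{lower:.3f}, {upper:.3f}]")
```

Resampling pairs rather than residuals keeps the sketch valid under heteroscedastic noise, which is the setting the abstract emphasizes.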

Paper Structure

This paper contains 26 sections, 18 theorems, 268 equations, 1 figure, 12 tables.

Key Result

Theorem 1

[General mean estimation]. Suppose that $\bar{f} \in \mathcal{F}$ is such that [...], so that $\phi_n$ is the approximating error. Suppose that $\mathcal{A}_n$ is chosen to satisfy [...]. Moreover, let $\mathcal{F}_{\mathcal{A}_n} \,:=\, \{ f_{\mathcal{A}_n}/\mathcal{A}_n \,:\, f \in \mathcal{F}\}$ and assume that [...] for some decreasing function $\eta_n \,:\, (0,1) \rightarrow \mathbb{R}_{\geq 0}$. If for s[...]

Figures (1)

  • Figure 1: The true variance function $g^*(\cdot)$ and the predicted $\hat{g}(\cdot)$ from the estimator based on the residuals. The rows from top to bottom correspond to Scenarios 1, 2, and 3 with $x_i \in \mathbb{R}^2$. For visualization purposes, regions with variance greater than $1$ are colored in grey.

Theorems & Definitions (37)

  • Theorem 1
  • Definition 1: $(p, C)$-smoothness
  • Definition 2: Space of Hierarchical Composition Models, kohler2019rate
  • Theorem 2
  • Theorem 3
  • Remark 1
  • Corollary 1
  • Theorem 4
  • Corollary 2
  • Theorem 5
  • ...and 27 more