Scalable Subsampling Inference for Deep Neural Networks

Kejin Wu; Dimitris N. Politis

TL;DR

The paper tackles scalable, inference-ready deep neural network (DNN) regression by combining scalable subsampling with subagging to form a subagged DNN estimator that can be trained efficiently on large datasets. On the theory side, it tightens an existing non-asymptotic error bound using the latest DNN approximation results, and proves that the subagged estimator converges faster under mild regularity conditions, with a tunable bias regime. It then develops confidence interval (CI) and prediction interval (PI) methods, based on the CLT and on iterated subsampling, that handle potential bias; these are shown to be asymptotically valid, with additional finite-sample refinements. Extensive simulations show substantial computational savings, competitive point estimation accuracy, and CI/PI procedures that perform well in finite samples.

Abstract

Deep neural networks (DNN) have received increasing attention in machine learning applications in the last several years. Recently, a non-asymptotic error bound has been developed to measure the performance of the fully connected DNN estimator with ReLU activation functions for estimating regression models. The paper at hand gives a small improvement on the current error bound based on the latest results on the approximation ability of DNN. More importantly, however, a non-random subsampling technique, scalable subsampling, is applied to construct a 'subagged' DNN estimator. Under regularity conditions, it is shown that the subagged DNN estimator is computationally efficient without sacrificing accuracy for either estimation or prediction tasks. Beyond point estimation/prediction, we propose different approaches to build confidence and prediction intervals based on the subagged DNN estimator. In addition to being asymptotically valid, the proposed confidence/prediction intervals appear to work well in finite samples. All in all, the scalable subsampling DNN estimator offers the complete package in terms of statistical inference, i.e., (a) computational efficiency; (b) point estimation/prediction accuracy; and (c) allowing for the construction of practically useful confidence and prediction intervals.
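
To make the subagging idea in the abstract concrete, here is a minimal sketch. It rests on assumptions not stated above: the non-random subsamples are taken as consecutive blocks of size b = n^beta whose starting points are h apart (the usual scalable subsampling construction), and a cubic polynomial least-squares fit stands in for the DNN so that the example runs with NumPy alone. The function names, the value beta = 0.7, and the simulated data are all hypothetical.

```python
import numpy as np

def scalable_subsample_indices(n, b, h):
    """Deterministic index blocks of length b whose starts are h apart (0-based)."""
    q = (n - b) // h + 1  # number of blocks that fit in a sample of size n
    return [np.arange(j * h, j * h + b) for j in range(q)]

def subagged_predict(x, y, x_new, b, h, fit, predict):
    """Fit the base learner on each block and average the predictions."""
    preds = []
    for idx in scalable_subsample_indices(len(x), b, h):
        model = fit(x[idx], y[idx])
        preds.append(predict(model, x_new))
    return np.mean(preds, axis=0)

# Stand-in base learner (assumption): cubic polynomial regression, not a DNN.
def fit_poly(x, y):
    return np.polyfit(x, y, deg=3)

def predict_poly(coef, x_new):
    return np.polyval(coef, x_new)

rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(-1, 1, n)
y = x**3 - x + rng.normal(scale=0.1, size=n)

b = int(n ** 0.7)  # subsample size b = n^beta with beta = 0.7 (arbitrary choice)
h = b              # h = b gives non-overlapping blocks
x_new = np.array([-0.5, 0.0, 0.5])
y_hat = subagged_predict(x, y, x_new, b, h, fit_poly, predict_poly)
```

Each base learner sees only b of the n observations, and the fits are independent of one another, so they can run in parallel. That independence is where the computational saving comes from. Swapping `fit_poly` and `predict_poly` for a neural network trainer would leave the aggregation logic unchanged.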

Paper Structure (14 sections, 4 theorems, 51 equations, 1 figure, 7 tables, 3 algorithms)

This paper contains 14 sections, 4 theorems, 51 equations, 1 figure, 7 tables, 3 algorithms.

Introduction
Standard fully connected deep neural network
Scalable subsampling
Estimation inference with DNN
Scalable subagging DNN estimator
Estimation of the bias order of DNN estimator
Confidence intervals
PCI in the case where $C_{\mu} = 0$
PCI in the case where $C_{\mu} \neq 0$
Predictive inference with the DNN estimator
Simulations
Simulations on point estimations
Simulations for CI and PI
Conclusions

Key Result

Theorem 4.1

Suppose assumptions A1 to A3 hold, with width $H = \Theta(n^{\frac{d}{2(\xi+\alpha)}} \log n)$ and depth $L = \Theta(\log n)$. Then the $L_2$ norm loss of the deep fully connected ReLU network estimator can be bounded with probability at least $1-\exp\left(-n^{\frac{d}{\xi+d}} \log^6 n\right)$ [...], where $C_2>0$ is another constant.

Figures (1)

Figure 1: Illustration of a fully connected DNN with $L = 2$, $H = 4$, $W = 37$, input dimension $d = 2$, and output dimension 1.
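
The value $W = 37$ in the caption can be checked with a short calculation. The sketch below assumes that $L$ counts hidden layers, $H$ is the width of each hidden layer, and $W$ counts all weights and biases; the caption does not spell this out, but the arithmetic reproduces 37 under that reading. The helper name `count_parameters` is hypothetical.

```python
def count_parameters(d, H, L, out_dim=1):
    """Weights plus biases of a fully connected net with L hidden layers of width H."""
    sizes = [d] + [H] * L + [out_dim]
    # Each layer contributes n_in * n_out weights and n_out biases.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

W = count_parameters(d=2, H=4, L=2)
print(W)  # 37 = (2*4 + 4) + (4*4 + 4) + (4*1 + 1)
```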

Theorems & Definitions (17)

Example 3.1: Kernel-smoothed function estimation
Remark
Theorem 4.1
Remark
Remark 4.1
Theorem 4.2
Remark 4.2
Example 4.1: Types of CI
Remark 4.3: The condition to guarantee $C_{\mu} = 0$
Remark 4.4: Iterated subsampling
...and 7 more
