Scalable Subsampling Inference for Deep Neural Networks

Kejin Wu, Dimitris N. Politis

TL;DR

The paper tackles scalable, inference‑ready deep neural network regression by integrating scalable subsampling with subagging to form a subagged DNN estimator that can be trained efficiently on large datasets. It strengthens the theory by improving existing non‑asymptotic bounds using the latest DNN approximation results, and proves that the subagging approach achieves faster convergence under mild regularity, with a tunable bias regime. It then develops confidence and prediction interval methods based on the CLT and iterated subsampling to handle potential bias, and demonstrates asymptotic validity along with finite‑sample enhancements. Extensive simulations show substantial computational savings, competitive point estimation accuracy, and CI/PI procedures that perform well in finite samples, supporting practical deployment of scalable, inference‑ready DNNs.

Abstract

Deep neural networks (DNNs) have received increasing attention in machine learning applications in the last several years. Recently, a non-asymptotic error bound has been developed to measure the performance of the fully connected DNN estimator with ReLU activation functions for estimating regression models. The paper at hand gives a small improvement on the current error bound based on the latest results on the approximation ability of DNNs. More importantly, however, a non-random subsampling technique--scalable subsampling--is applied to construct a `subagged' DNN estimator. Under regularity conditions, it is shown that the subagged DNN estimator is computationally efficient without sacrificing accuracy for either estimation or prediction tasks. Beyond point estimation/prediction, we propose different approaches to build confidence and prediction intervals based on the subagged DNN estimator. In addition to being asymptotically valid, the proposed confidence/prediction intervals appear to work well in finite samples. All in all, the scalable subsampling DNN estimator offers the complete package in terms of statistical inference, i.e., (a) computational efficiency; (b) point estimation/prediction accuracy; and (c) allowing for the construction of practically useful confidence and prediction intervals.
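
The subagging idea in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in this summary: the non-random blocks are taken to be consecutive stretches of length `b` starting every `h` observations, and an ordinary least-squares line (`fit_line`) stands in for training a DNN on each block. The names `scalable_subsamples`, `fit_line`, and `subagged_predict` are hypothetical, and the simulated data are purely illustrative.

```python
import random


def scalable_subsamples(n, b, h):
    """Deterministic blocks of length b starting every h observations.

    Assumed scheme: q = floor((n - b) / h) + 1 blocks; h == b means no overlap.
    """
    q = (n - b) // h + 1
    return [range(j * h, j * h + b) for j in range(q)]


def fit_line(xs, ys):
    """Least-squares fit of y = a + c * x; a stand-in for DNN training."""
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c = sxy / sxx
    a = mean_y - c * mean_x
    return lambda x: a + c * x


def subagged_predict(xs, ys, b, h, x0):
    """Average the predictions at x0 of the estimators fitted on each block."""
    preds = []
    for block in scalable_subsamples(len(xs), b, h):
        model = fit_line([xs[i] for i in block], [ys[i] for i in block])
        preds.append(model(x0))
    return sum(preds) / len(preds)


# Illustrative data: y = 1 + 2x + noise, so the regression function at 0.5 is 2.
random.seed(0)
n = 1000
xs = [random.uniform(-1, 1) for _ in range(n)]
ys = [1.0 + 2.0 * x + random.gauss(0, 0.1) for x in xs]
estimate = subagged_predict(xs, ys, b=100, h=100, x0=0.5)
print(round(estimate, 2))
```

Each block is fitted independently, so the per-block fits can run in parallel on samples of size `b` rather than `n`, which is where the computational saving would come from under these assumptions.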

Paper Structure

This paper contains 14 sections, 4 theorems, 51 equations, 1 figure, 7 tables, 3 algorithms.

Key Result

Theorem 4.1

Under assumptions A1 to A3, with width $H = \Theta(n^{\frac{d}{2(\xi+\alpha)}} \log n)$ and depth $L = \Theta(\log n)$, the $L_2$ norm loss of the deep fully connected ReLU network estimator Eq:widehatDNN can be bounded with probability at least $1-\exp \left(-n^{\frac{d}{\xi+d}} \log ^6 n\right)$ [...], where $C_2>0$ is another constant.

Figures (1)

  • Figure 1: The illustration of a fully connected DNN with $L = 2$, $H = 4$ and $W = 37$, and input dimension $d = 2$ and output dimension 1.
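
The value $W = 37$ in the caption can be checked by counting weights and biases layer by layer. The sketch below assumes every layer carries a bias term and that $L$ counts hidden layers only; `count_parameters` is a hypothetical helper name, not something from the paper.

```python
def count_parameters(d, H, L, out_dim=1):
    """Total number of weights and biases in a fully connected network.

    d: input dimension, H: width of each hidden layer,
    L: number of hidden layers, out_dim: output dimension.
    """
    sizes = [d] + [H] * L + [out_dim]
    # Each layer contributes fan_in * fan_out weights plus fan_out biases.
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


W = count_parameters(d=2, H=4, L=2)
print(W)  # 37 = (2*4 + 4) + (4*4 + 4) + (4*1 + 1)
```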

Theorems & Definitions (17)

  • Example 3.1: Kernel-smoothed function estimation
  • Remark
  • Theorem 4.1
  • Remark
  • Remark 4.1
  • Theorem 4.2
  • Remark 4.2
  • Example 4.1: Types of CI
  • Remark 4.3: The condition to guarantee $C_{\mu} = 0$
  • Remark 4.4: Iterated subsampling
  • ...and 7 more