High Confidence Level Inference is Almost Free using Parallel Stochastic Optimization

Wanrong Zhu; Zhipeng Lou; Ziyang Wei; Wei Biao Wu

TL;DR

This work addresses uncertainty quantification for online stochastic optimization by introducing a parallel-run inference framework that builds a $t$-based confidence interval for any linear functional $\upsilon^{\top}x^{*}$ using $K$ independent runs. The method updates a parallel average $\bar{x}_{K,n}$ and a variance surrogate $\widehat{\sigma}_{\upsilon}^{2}$ to form $\widehat{CI}_{\upsilon}=\left[\upsilon^{\top}\bar{x}_{K,n}-\frac{t_{1-\alpha/2,K-1}\widehat{\sigma}_{\upsilon}}{\sqrt{K}},\ \upsilon^{\top}\bar{x}_{K,n}+\frac{t_{1-\alpha/2,K-1}\widehat{\sigma}_{\upsilon}}{\sqrt{K}}\right]$, with the corresponding $t$-statistic converging in distribution to $t_{K-1}$. The theoretical core is a Gaussian approximation for online estimators that yields explicit rates for the relative coverage error $\Delta_{\alpha}$ and shows that the $t$-statistic is asymptotically pivotal, enabling valid high-confidence inference in online settings. Empirical results on linear and logistic regression, plus an MNIST-based mean image task, demonstrate accurate coverage, competitive interval lengths, and substantial computational savings from the nearly cost-free inference, especially when leveraging parallel computing. The approach is easily integrated into existing stochastic algorithms and is well suited to large-scale, streaming, or federated contexts where parallelism is natural.
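As a rough illustration of the interval above, the sketch below computes $\widehat{CI}_{\upsilon}$ from the $K$ scalar outputs $\upsilon^{\top}\widehat{x}_{k}$ of the independent runs. It assumes that $\widehat{\sigma}_{\upsilon}$ is the sample standard deviation of those $K$ values (with a $K-1$ denominator); this summary does not spell out the surrogate, so treat that choice as an assumption. The function name `t_confidence_interval`, the example numbers, and the convention of passing the quantile $t_{1-\alpha/2,K-1}$ in as `t_crit` are all hypothetical.

```python
import math
from statistics import mean, stdev


def t_confidence_interval(run_estimates, t_crit):
    """Interval for a scalar functional v^T x* from K independent run estimates.

    run_estimates: the K values v^T x_hat_k, one per independent run.
    t_crit: the quantile t_{1-alpha/2, K-1}, supplied by the caller.
    """
    k = len(run_estimates)
    if k < 2:
        raise ValueError("need at least two runs to estimate the variance")
    center = mean(run_estimates)      # plays the role of v^T x_bar_{K,n}
    sigma_hat = stdev(run_estimates)  # ASSUMED surrogate: sample std, K-1 denominator
    half_width = t_crit * sigma_hat / math.sqrt(k)
    return center - half_width, center + half_width


# Six made-up run outputs; t_{0.975, 5} is approximately 2.571 (95% level).
estimates = [1.02, 0.97, 1.05, 0.99, 1.01, 0.96]
low, high = t_confidence_interval(estimates, t_crit=2.571)
```

Only the $K$ run outputs are needed at inference time, which is consistent with the claim that the extra cost beyond running the optimizers is small.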

Abstract

Uncertainty quantification for estimation through stochastic optimization solutions in an online setting has gained popularity recently. This paper introduces a novel inference method focused on constructing confidence intervals with efficient computation and fast convergence to the nominal level. Specifically, we propose to use a small number of independent runs to acquire distribution information and construct a t-based confidence interval. Our method requires minimal additional computation and memory beyond the standard updating of estimates, making the inference process almost cost-free. We provide a rigorous theoretical guarantee for the confidence interval, demonstrating that the coverage is approximately exact with an explicit convergence rate, which allows for high confidence level inference. In particular, a new Gaussian approximation result is developed for the online estimators to characterize the coverage properties of our confidence intervals in terms of relative errors. Our method can also leverage parallel computing across multiple cores to further accelerate computation. It is easy to implement and can be integrated with existing stochastic algorithms without complicated modifications.

Paper Structure (14 sections, 2 theorems, 28 equations, 9 figures, 3 tables, 1 algorithm)


Introduction
Background: existing confidence interval construction
Contribution
Inference with parallel runs of stochastic algorithms
Parallel computing
Asymptotic $t$-distribution
Theoretical guarantee
Convergence characterization for ASGD
Main results
Experiment
Simulation
Hand-written digit dataset
Discussion
Additional numerical results

Key Result

Theorem 1

Assume that $\{x_{i}\}_{i=1}^{n}$ is an SGD sequence with step sizes $\eta_{i} = \eta \times i^{-\beta}$ for some constant $\beta\in(1/2, 1)$. Let $\bar{x}_{n} = n^{-1} \sum_{i=1}^{n}x_{i}$. Under Assumptions Assumption_convex_F--Assumption_Hessian_Lip, on a sufficiently rich probability space, there exists a random vector $W_{n} \overset{\mathcal{D}}{=} \sqrt{n
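
The step-size schedule $\eta_{i} = \eta \times i^{-\beta}$ and the averaged iterate $\bar{x}_{n}$ from Theorem 1 can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes the usual SGD recursion $x_{i} = x_{i-1} - \eta_{i}\nabla f(x_{i-1}, \xi_{i})$ (the recursion is not reproduced in this summary) and a hypothetical scalar loss $f(x, \xi) = (x - \xi)^{2}/2$, whose minimizer is the mean of the data stream. The names `asgd_mean` and `parallel_runs`, the seeds, and the constants are illustrative only, and the $K$ runs are executed one after another rather than on separate cores.

```python
import random


def asgd_mean(stream, eta=0.5, beta=0.75, x0=0.0):
    """Averaged SGD for the toy loss f(x, xi) = (x - xi)^2 / 2."""
    x, x_bar = x0, 0.0
    for i, xi in enumerate(stream, start=1):
        step = eta * i ** (-beta)  # eta_i = eta * i^{-beta}, with beta in (1/2, 1)
        x -= step * (x - xi)       # stochastic gradient of the toy loss is (x - xi)
        x_bar += (x - x_bar) / i   # running average of the iterates, O(1) memory
    return x_bar


def parallel_runs(k, n, true_mean=2.0, seed=0):
    """K independent runs, each on its own n observations (sequential here)."""
    results = []
    for run in range(k):
        rng = random.Random(seed + run)  # separate generator per run
        stream = (rng.gauss(true_mean, 1.0) for _ in range(n))
        results.append(asgd_mean(stream))
    return results


estimates = parallel_runs(k=6, n=5000)
```

Each run keeps only its current iterate and running average, so the $K$ outputs in `estimates` are the only quantities a $t$-based interval would need on top of ordinary optimization.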

Figures (9)

Figure 1: Realizations of parallel computing and inference.
Figure 2: Effect of $K$. Plot (a): relative error of coverage; plot (b): the length of confidence interval. The nominal coverage probability is 0.99. The total sample size $N$ is $60000$ for linear models and $200000$ for logistic models.
Figure 3: Linear Regression $d = 20$: Left: relative error of coverage; Middle: empirical coverage; Right: length of confidence intervals.
Figure 4: Logistic Regression $d =20$: Left: relative error of coverage; Middle: empirical coverage; Right: length of confidence intervals.
Figure 5: Computation time, $d = 20$.
...and 4 more figures

Theorems & Definitions (6)

Remark 1: Almost cost-free
Remark 2: Choice of $K$
Theorem 1
Remark 3
Remark 4
Theorem 2
