Mean square error analysis of stochastic gradient and variance-reduced sampling algorithms

Jianfeng Lu, Xuda Ye, Zhennan Zhou

TL;DR

This work develops a discrete Poisson equation framework to rigorously bound mean square error (MSE) for stochastic-gradient sampling of underdamped Langevin dynamics under global convexity. It proves a first-order convergence for the numerical bias of SG-UBU, with a leading coefficient tied to the stochastic gradient variance, and reveals a phase transition for variance-reduced variants SVRG-UBU and SAGA-UBU, where the bias shifts to second-order as the step size becomes small enough. An empirical criterion is provided to guide the choice between SG-UBU and SVRG-UBU to optimize computational efficiency, supported by numerical experiments. Overall, the discrete Poisson approach offers sharp, modular MSE bounds by separately bounding stability, local errors, and variance-reduction effects, with practical implications for scalable Bayesian sampling and data-assimilation tasks.

Abstract

This paper considers mean square error (MSE) analysis for stochastic gradient sampling algorithms applied to underdamped Langevin dynamics under a global convexity assumption. A novel discrete Poisson equation framework is developed to bound the time-averaged sampling error. For the Stochastic Gradient UBU (SG-UBU) sampler, we derive an explicit MSE bound and establish that the numerical bias exhibits first-order convergence with respect to the step size $h$, with the leading error coefficient proportional to the variance of the stochastic gradient. The analysis is further extended to variance-reduced algorithms for finite-sum potentials, specifically the SVRG-UBU and SAGA-UBU methods. For these algorithms, we identify a phase transition phenomenon whereby the convergence rate of the numerical bias shifts from first to second order as the step size decreases below a critical threshold. Theoretical findings are validated by numerical experiments. In addition, the analysis provides a practical empirical criterion for selecting between the mini-batch SG-UBU and SVRG-UBU samplers to achieve optimal computational efficiency.
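To make the samplers named in the abstract concrete, the following is a minimal sketch of one SG-UBU step and an SVRG-style gradient estimator, not the authors' implementation. It assumes unit mass and unit temperature, a friction parameter `gamma`, and a finite-sum potential normalized as $U = \frac{1}{N}\sum_i U_i$; the names `u_step`, `sg_ubu_step`, `SVRGGradient`, `stoch_grad`, `grad_i`, and `refresh_every` are all hypothetical. The U step integrates the linear part of underdamped Langevin dynamics exactly over half a step, and the B step applies a full velocity kick with an unbiased gradient estimate.

```python
import numpy as np


def u_step(x, v, t, gamma, rng):
    """Exactly integrate dx = v dt, dv = -gamma*v dt + sqrt(2*gamma) dW over time t > 0."""
    eta = np.exp(-gamma * t)
    # Per-coordinate covariance of the Gaussian increments (xi_x, xi_v).
    var_v = 1.0 - eta**2
    cov_xv = (1.0 - eta) ** 2 / gamma
    var_x = (2.0 / gamma) * (
        t - 2.0 * (1.0 - eta) / gamma + (1.0 - eta**2) / (2.0 * gamma)
    )
    z1 = rng.standard_normal(x.shape)
    z2 = rng.standard_normal(x.shape)
    xi_v = np.sqrt(var_v) * z1
    # Conditional variance of xi_x given xi_v, clipped against round-off.
    resid = max(var_x - cov_xv**2 / var_v, 0.0)
    xi_x = (cov_xv / np.sqrt(var_v)) * z1 + np.sqrt(resid) * z2
    return x + (1.0 - eta) / gamma * v + xi_x, eta * v + xi_v


def sg_ubu_step(x, v, h, gamma, stoch_grad, rng):
    """One U(h/2) -> B(h) -> U(h/2) step with a stochastic gradient in B."""
    x, v = u_step(x, v, 0.5 * h, gamma, rng)
    v = v - h * stoch_grad(x, rng)  # kick with an unbiased estimate of grad U(x)
    return u_step(x, v, 0.5 * h, gamma, rng)


class SVRGGradient:
    """SVRG-style estimator for U = (1/N) * sum_i U_i (assumed normalization)."""

    def __init__(self, grad_i, n_components, refresh_every):
        self.grad_i = grad_i  # grad_i(i, x) returns the gradient of U_i at x
        self.n = n_components
        self.refresh_every = refresh_every
        self.count = 0
        self.anchor = None
        self.full = None

    def __call__(self, x, rng):
        if self.count % self.refresh_every == 0:
            # Refresh the anchor point and its full gradient.
            self.anchor = x.copy()
            self.full = np.mean(
                [self.grad_i(i, self.anchor) for i in range(self.n)], axis=0
            )
        self.count += 1
        i = rng.integers(self.n)
        # Control variate: the component gradient at the anchor is subtracted,
        # and the full anchor gradient is added back to keep the estimate unbiased.
        return self.grad_i(i, x) - self.grad_i(i, self.anchor) + self.full
```

An `SVRGGradient` instance can be passed as `stoch_grad` to `sg_ubu_step`, which gives a rough analogue of SVRG-UBU. The anchor refresh schedule used here is an illustrative choice, and the paper's algorithm may organize it differently.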

Paper Structure

This paper contains 55 sections, 31 theorems, 406 equations, 6 figures, 4 tables, 4 algorithms.

Key Result

Theorem 1

Under Assumptions asP and asT, let the step size satisfy $h\leqslant \frac{1}{4}$. Then the discrete Poisson solution $\phi_h(\bm x,\bm v)$ has a uniformly bounded gradient and Hessian matrix, with a bounding constant $C$ that depends only on $(M_i)_{i=1}^3$ and $(L_i)_{i=1}^2$.
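To indicate why derivative bounds on $\phi_h$ are useful, here is a generic form of a discrete Poisson equation and the resulting error decomposition. This is a sketch under assumptions rather than the paper's exact statement: $P_h$ (the one-step transition operator of the sampler), $Z_k = (\bm x_k, \bm v_k)$ (the chain), $\pi_h$ (its invariant distribution), and $K$ (the number of steps) are notation introduced here, and the paper's normalization may differ.

```latex
% Discrete Poisson equation for the one-step operator P_h (assumed form):
\[
  \phi_h(z) - (P_h \phi_h)(z) = h \bigl( f(z) - \pi_h(f) \bigr).
\]
% Summing over the trajectory and adding and subtracting \phi_h(Z_{k+1}):
\[
  \frac{1}{K} \sum_{k=0}^{K-1} f(Z_k) - \pi_h(f)
  = \frac{\phi_h(Z_0) - \phi_h(Z_K)}{K h}
  + \frac{1}{K h} \sum_{k=0}^{K-1}
    \bigl( \phi_h(Z_{k+1}) - (P_h \phi_h)(Z_k) \bigr).
\]
```

The first term on the right is a boundary term that decays like $1/(Kh)$, and the second is a sum of martingale differences. Bounds on the gradient and Hessian of $\phi_h$ control the size of each increment, which is the kind of estimate a mean square error bound on the time average relies on.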

Figures (6)

  • Figure 1: Graphs of the 1D potential function $U(x)$ and the test function $f(x)$.
  • Figure 2: Log-log plot of the SG-UBU numerical bias $\pi_h(f) - \pi(f)$ vs. step size $h$ for the 1D potential. The computed bias (blue solid line) is compared to the theoretical linear approximation (red dashed line). (Left) Gaussian noise case. (Right) Finite-sum case.
  • Figure 3: Graphs of the 2D potential function $U(x_1,x_2)$ and the test function $f(x_1,x_2)$.
  • Figure 4: Log-log plot of the SG-UBU numerical bias $\pi_h(f) - \pi(f)$ vs. step size $h$ for the 2D potential. The computed bias (blue solid line) is compared to the theoretical linear approximation (red dashed line). (Left) Gaussian noise case. (Right) Finite-sum case.
  • Figure 5: Log-log plot of the sampling error vs. step size $h$ for SG-UBU, SVRG-UBU, and SAGA-UBU for different numbers of components $N\in\{10,50,100,500\}$.
  • ...and 1 more figure

Theorems & Definitions (59)

  • Theorem 1
  • Lemma 2
  • Theorem 3
  • Lemma 4
  • Lemma 5
  • Lemma 6
  • Theorem 7
  • Proof: Sketch of Proof
  • Theorem 8
  • Lemma 10
  • ...and 49 more