Bootstrap SGD: Algorithmic Stability and Robustness

Andreas Christmann, Yunwen Lei

Abstract

This paper investigates several ways of applying the empirical bootstrap to stochastic gradient descent (SGD) for minimizing the empirical risk over a separable Hilbert space, from the viewpoint of algorithmic stability and statistical robustness. The first two types of approaches are based on averages and are studied theoretically: a generalization analysis of bootstrap SGD of Type 1 and Type 2, based on algorithmic stability, is carried out. A third type of bootstrap SGD is proposed to demonstrate that purely distribution-free pointwise confidence intervals for the median curve can be constructed using bootstrap SGD.

Paper Structure

This paper contains 13 sections, 11 theorems, 78 equations, 2 figures, 2 tables.

Key Result

Lemma 1

Let $A$ be an algorithm. Let $G,L,\epsilon>0$.

Figures (2)

  • Figure 1: Behavior of Bootstrap Method of Type 2, $n=1000$ and $B=101$. The shaded scattered points show the training examples, the blue plot shows the true function, the cyan line shows the SGD approximation of the true data, the pink area shows the pointwise $0.95$-confidence intervals, and the red line shows the average of the bootstrap approximations.
  • Figure 2: Behavior of Bootstrap Method of Type 3, $n=1000$ and $B=101$. The shaded scattered points show the training examples, the blue plot shows the true function, the cyan line shows the SGD approximation of the true data, the pink area shows the pointwise $0.95$-confidence intervals for the median, and the red line shows the median of the bootstrap approximations.
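The quantities named in the two captions (bootstrap approximations, their pointwise average and median, and a pointwise band) can be illustrated with a minimal sketch. It is not the authors' implementation, and the summary does not give the exact definitions of Types 1 to 3. Everything below is an assumption made for illustration: the sine target, the noise level, the cubic polynomial features standing in for the Hilbert space, the step size and epoch count, and the helper names `features`, `sgd_least_squares`, and `bootstrap_sgd`. The band is a plain empirical 2.5%/97.5% quantile band over the bootstrap curves, used here only as a stand-in for the paper's confidence intervals.

```python
import numpy as np


def features(x, degree=3):
    # Hypothetical feature map: polynomial features 1, x, ..., x^degree.
    return np.vander(x, degree + 1, increasing=True)


def sgd_least_squares(X, y, step=0.05, epochs=2, rng=None):
    # Plain SGD on the squared loss for a linear model in the features X.
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            grad = (X[i] @ w - y[i]) * X[i]
            w -= step * grad
    return w


def bootstrap_sgd(x, y, x_grid, B=101, seed=0):
    # Run SGD on B bootstrap resamples and return all B predicted curves.
    rng = np.random.default_rng(seed)
    n = len(x)
    X, X_grid = features(x), features(x_grid)
    preds = np.empty((B, len(x_grid)))
    for b in range(B):
        idx = rng.integers(0, n, size=n)  # draw n indices with replacement
        w_b = sgd_least_squares(X[idx], y[idx], rng=rng)
        preds[b] = X_grid @ w_b
    return preds


# Synthetic data (assumed): n = 1000 noisy samples of a sine curve.
data_rng = np.random.default_rng(1)
n = 1000
x = data_rng.uniform(-1.0, 1.0, n)
y = np.sin(np.pi * x) + 0.3 * data_rng.standard_normal(n)
x_grid = np.linspace(-1.0, 1.0, 50)

preds = bootstrap_sgd(x, y, x_grid, B=101)

mean_curve = preds.mean(axis=0)          # pointwise average of the bootstrap curves
median_curve = np.median(preds, axis=0)  # pointwise median of the bootstrap curves
lower, upper = np.quantile(preds, [0.025, 0.975], axis=0)  # empirical pointwise band
```

Averaging and taking the median are done pointwise over the evaluation grid, so each of `mean_curve`, `median_curve`, `lower`, and `upper` has one value per grid point. The median is less sensitive than the average to a few badly behaved bootstrap runs, which is the usual robustness motivation for median-based summaries.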

Theorems & Definitions (32)

  • Definition 1: Uniform stability [bousquet2002stability]
  • Definition 2: Argument stability
  • Definition 3
  • Example 1
  • Lemma 1: Stability and Generalization [lei2020fine]
  • Definition 4: Bootstrap method
  • Definition 5: Bootstrap SGD
  • Lemma 2
  • Theorem 3: Stability bounds
  • Remark 1
  • ...and 22 more