A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method

Simon Lacoste-Julien; Mark Schmidt; Francis Bach

TL;DR

The paper introduces a simple weighted averaging scheme for the projected stochastic subgradient method, in which iterate $w_t$ receives weight $t+1$. Paired with a tailored step-size, this scheme yields an $O(1/t)$ convergence rate and avoids the logarithmic factor that appears in standard analyses. The averaging rule is easy to implement online, and SVM-style experiments show performance competitive with established averaging strategies. The discussion situates the result within the broader landscape of averaging schemes, including polynomial-decay and suffix averaging, and derives a finite-variance bound for SVM within this framework.

Abstract

In this note, we present a new averaging technique for the projected stochastic subgradient method. By using a weighted average with a weight of $t+1$ for each iterate $w_t$ at iteration $t$, we obtain the convergence rate of $O(1/t)$ with both an easy proof and an easy implementation. The new scheme is compared empirically to existing techniques, with similar performance behavior.
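The weighted average described above can be maintained online, since the weight $t+1$ on $w_t$ divided by the running total $1 + 2 + \dots + (t+1)$ equals $2/(t+2)$. The following is a minimal sketch of that idea, not the authors' implementation. Several pieces are assumptions made for illustration: the step size $2/(\mu(t+1))$ (the text only mentions a tailored step-size), the Euclidean ball as the projection set, the toy quadratic objective, and all function names.

```python
import random


def project_ball(w, radius):
    """Euclidean projection of w onto the ball {v : ||v|| <= radius}."""
    norm = sum(x * x for x in w) ** 0.5
    if norm <= radius:
        return list(w)
    return [x * radius / norm for x in w]


def weighted_average(iterates):
    """Average of w_0, w_1, ... with weight t + 1 on w_t, computed online."""
    avg = None
    for t, w in enumerate(iterates):
        # (t + 1) divided by the total weight 1 + 2 + ... + (t + 1).
        rho = 2.0 / (t + 2)
        if avg is None:
            avg = list(w)
        else:
            avg = [(1.0 - rho) * a + rho * x for a, x in zip(avg, w)]
    return avg


def projected_subgradient_iterates(subgrad, w0, mu, radius, steps, rng):
    """Yield w_0, ..., w_steps of a projected stochastic subgradient method."""
    w = list(w0)
    yield w
    for t in range(steps):
        g = subgrad(w, rng)
        # Assumed step size; the text above does not state the exact schedule.
        gamma = 2.0 / (mu * (t + 1))
        w = project_ball([x - gamma * gi for x, gi in zip(w, g)], radius)
        yield w


# Hypothetical toy problem: f(w) = (MU / 2) * ||w - TARGET||^2 with noisy gradients.
MU = 1.0
TARGET = [0.5, -0.25]


def noisy_grad(w, rng):
    """Gradient of the toy objective plus Gaussian noise."""
    return [MU * (x - c) + rng.gauss(0.0, 0.1) for x, c in zip(w, TARGET)]


def run_demo(steps=2000, seed=0):
    """Run the method on the toy problem and return the weighted average."""
    rng = random.Random(seed)
    iterates = projected_subgradient_iterates(
        noisy_grad, [0.0, 0.0], MU, 1.0, steps, rng
    )
    return weighted_average(iterates)
```

Calling `run_demo()` returns the averaged iterate for the toy problem. Because the average is updated one iterate at a time, only the current iterate and the current average need to be stored.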
