A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method

Simon Lacoste-Julien, Mark Schmidt, Francis Bach

TL;DR

The paper introduces a simple weighted averaging scheme for the projected stochastic subgradient method, in which the iterate at iteration $t$ receives weight $t+1$. Paired with a tailored step size, this scheme yields an $O(1/t)$ convergence rate and avoids the logarithmic factor that appears in standard analyses. The averaging rule is easy to implement online, and SVM-style experiments show performance competitive with established averaging strategies. The discussion situates the result within the broader landscape of averaging schemes, including polynomial-decay and suffix averaging, and derives a finite-variance bound for SVM within this framework.

Abstract

In this note, we present a new averaging technique for the projected stochastic subgradient method. By using a weighted average with a weight of t+1 for each iterate w_t at iteration t, we obtain the convergence rate of O(1/t) with both an easy proof and an easy implementation. The new scheme is compared empirically to existing techniques, with similar performance behavior.
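
As a rough illustration of the averaging rule described above, the sketch below runs a projected subgradient method on a one-dimensional toy problem and maintains the $(t+1)$-weighted average online. Because the weights $1, 2, \dots, k+1$ sum to $(k+1)(k+2)/2$, the weighted average can be updated with a mixing coefficient of $2/(k+2)$ when iterate $w_k$ arrives, so no past iterates need to be stored. The step size $2/(\mu(t+1))$, the toy objective $\frac{\mu}{2}w^2 + |w - 0.5|$, and all function names are assumptions chosen for illustration; they are not taken from this summary.

```python
import random


def update_average(w_bar, w_k, k):
    """Fold iterate w_k (which carries weight k + 1) into the running average.

    The weights 1, 2, ..., k + 1 sum to (k + 1)(k + 2) / 2, so the newest
    iterate's share of the average is rho = 2 / (k + 2).
    """
    rho = 2.0 / (k + 2)
    return (1.0 - rho) * w_bar + rho * w_k


def project_interval(w, radius):
    """Euclidean projection onto the interval [-radius, radius]."""
    return max(-radius, min(radius, w))


def noisy_subgradient(w, mu, noise_std, rng):
    """Noisy subgradient of the toy objective (mu / 2) * w**2 + |w - 0.5|."""
    sign = (w > 0.5) - (w < 0.5)  # a valid subgradient of |w - 0.5|
    return mu * w + sign + rng.gauss(0.0, noise_std)


def run(mu=1.0, radius=2.0, n_steps=2000, noise_std=0.1, seed=0):
    """Projected stochastic subgradient method with (t + 1)-weighted averaging."""
    rng = random.Random(seed)
    w = 0.0
    w_bar = update_average(0.0, w, 0)  # rho = 1 at k = 0, so w_bar = w_0
    for t in range(n_steps):
        g = noisy_subgradient(w, mu, noise_std, rng)
        gamma = 2.0 / (mu * (t + 1))  # assumed step size, not from this summary
        w = project_interval(w - gamma * g, radius)
        w_bar = update_average(w_bar, w, t + 1)  # w is now iterate t + 1
    return w, w_bar


if __name__ == "__main__":
    last, averaged = run()
    print("last iterate:", last, "weighted average:", averaged)
```

For $\mu = 1$ the toy objective is minimized at $w = 0.5$, so the returned weighted average can be compared against that value. Only two scalars (the current iterate and the running average) are kept in memory, which is the sense in which the rule is easy to implement online.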

Paper Structure

This paper contains 8 sections, 16 equations, 1 figure.

Figures (1)

  • Figure 1: Comparison of optimization strategies for the support vector machine objective. Top, from left to right: quantum, protein, and sido data sets. Bottom, from left to right: rcv1, covertype, and news data sets. This figure is best viewed in colour.