A stochastic gradient descent algorithm with random search directions

Eméric Gbaguidi

TL;DR

The paper addresses unconstrained finite-sum optimization, where full gradient evaluations are costly in high dimensions, by introducing SCORS, a stochastic gradient method with random search directions. The update is $X_{n+1}=X_n-\gamma_n D(V_{n+1})\nabla f_{U_{n+1}}(X_n)$, where $D(v)=vv^T$ and $\mathbb{E}[D(V_{n+1})|\mathcal{F}_n]=I_d$. Under mild smoothness and growth conditions, the paper proves almost-sure convergence with decreasing steps, establishes a central limit theorem with asymptotic covariance $\Sigma=\int_0^\infty (e^{-(H-I_d/2)u})^T\Gamma e^{-(H-I_d/2)u}du$, where $\Gamma=\mathbb{E}[V V^T Q V V^T]$, and derives non-asymptotic $\mathbb{L}^p$ rates $\mathbb{E}\|X_n-x^*\|^{2p}\le K_p/n^{p\alpha}$ for $\gamma_n=c/n^\alpha$ with $\tfrac12<\alpha\le1$. The theoretical results depend on the search-direction distribution (uniform, non-uniform, Gaussian, spherical) through explicit forms of $\Gamma$. Numerical experiments on logistic regression corroborate the CLT and reveal practical performance trade-offs, with uniform directions often achieving the smallest asymptotic variance and the best per-iteration efficiency.
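The update rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's implementation. The function names `sample_direction` and `scors`, the per-term gradient callback `grad_i`, and the normalizations chosen so that $\mathbb{E}[VV^T]=I_d$ are all assumptions made for the example, and the non-uniform case is omitted.

```python
import numpy as np

def sample_direction(rng, d, kind="uniform"):
    """Draw a search direction V with E[V V^T] = I_d.

    The normalizations below are assumptions chosen to satisfy that
    condition; the paper may parametrize these laws differently.
    """
    if kind == "uniform":
        # sqrt(d) * e_i, with the coordinate i drawn uniformly at random
        v = np.zeros(d)
        v[rng.integers(d)] = np.sqrt(d)
        return v
    if kind == "gaussian":
        # standard Gaussian vector
        return rng.standard_normal(d)
    if kind == "spherical":
        # uniform on the sphere of radius sqrt(d)
        g = rng.standard_normal(d)
        return np.sqrt(d) * g / np.linalg.norm(g)
    raise ValueError(f"unknown direction law: {kind}")

def scors(grad_i, x0, n_terms, n_iter, c=0.1, alpha=0.75, kind="uniform", seed=0):
    """Iterate X_{n+1} = X_n - gamma_n * V V^T * grad f_U(X_n), gamma_n = c / n^alpha.

    grad_i(x, i) is a hypothetical callback returning the gradient of the
    i-th term of the finite sum at x.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    for n in range(1, n_iter + 1):
        u = rng.integers(n_terms)            # index U_{n+1} of the sampled term
        v = sample_direction(rng, d, kind)   # search direction V_{n+1}
        g = grad_i(x, u)
        # D(v) g = v (v^T g): only the projection of g onto v is used
        x -= (c / n**alpha) * v * (v @ g)
    return x
```

As written, the sketch forms the full per-term gradient and then projects it onto `v`, which keeps the code short. The computational saving only appears when the scalar `v @ g` is obtained directly as a directional derivative, without building `g`.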

Abstract

Stochastic coordinate descent algorithms are efficient methods in which each iterate is obtained by fixing most coordinates at their values from the current iteration and approximately minimizing the objective with respect to the remaining coordinates. However, this approach is usually restricted to the canonical basis vectors of $\mathbb{R}^d$. In this paper, we develop a new class of stochastic gradient descent algorithms with random search directions, which use the directional derivative of the gradient estimate along more general random vectors. We establish the almost sure convergence of these algorithms with decreasing step sizes. We further investigate their central limit theorem, paying particular attention to the impact of the search distributions on the asymptotic covariance matrix. We also provide non-asymptotic $\mathbb{L}^p$ rates of convergence.
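To see how the canonical-basis setting fits the more general framework, here is a short worked check. It assumes that a uniform coordinate direction is normalized as $V=\sqrt{d}\,e_I$ with $I$ uniform on $\{1,\dots,d\}$; this normalization is inferred from the requirement $\mathbb{E}[VV^T]=I_d$ and may differ from the paper's exact convention.

$$\mathbb{E}[VV^T]=\sum_{i=1}^d \frac{1}{d}\, d\, e_i e_i^T = I_d, \qquad X_{n+1}=X_n-\gamma_n\, d\, \big(e_I^T \nabla f_{U_{n+1}}(X_n)\big)\, e_I .$$

Under this assumption, each step changes a single randomly chosen coordinate, as in stochastic coordinate descent, with the step rescaled by $d$. Replacing $V$ by a standard Gaussian vector, for which $\mathbb{E}[VV^T]=I_d$ also holds, moves the iterate along a direction that is no longer aligned with the canonical basis.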

Paper Structure

Table of Contents

Key Result

Figures (3)

Theorems & Definitions (12)