First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities

Aleksandr Beznosikov, Sergey Samsonov, Marina Sheshukova, Alexander Gasnikov, Alexey Naumov, Eric Moulines

TL;DR

This paper presents a unified theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities under Markovian noise, covering both non-convex and strongly convex minimization problems.

Abstract

This paper delves into stochastic optimization problems that involve Markovian noise. We present a unified approach for the theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities. Our approach covers scenarios for both non-convex and strongly convex minimization problems. To achieve an optimal (linear) dependence on the mixing time of the underlying noise sequence, we use the randomized batching scheme, which is based on the multilevel Monte Carlo method. Moreover, our technique allows us to eliminate the limiting assumptions of previous research on Markov noise, such as the need for a bounded domain and uniformly bounded stochastic gradients. Our extension to variational inequalities under Markovian noise is original. Additionally, we provide lower bounds that match the oracle complexity of our method in the case of strongly convex optimization problems.
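
To make the randomized batching idea concrete, here is a minimal sketch of a multilevel Monte Carlo style gradient estimator on a toy problem. It is an illustration under stated assumptions, not the paper's algorithm. The objective f(x) = x^2 / 2, the AR(1) noise chain, and the names `ar1_step`, `sample_level`, `mlmc_gradient`, `base_batch`, and `max_factor` are all hypothetical choices made for this example. The estimator draws a random level J with P(J = j) = 2^(-j), then corrects a small base batch with the scaled difference between a batch mean and the mean of its first half. In expectation this matches a batch of size `base_batch * max_factor`, while the expected number of samples grows only logarithmically in `max_factor`.

```python
import math
import random


def ar1_step(z, rho, rng):
    """Advance a toy AR(1) Markov chain whose stationary law is N(0, 1)."""
    return rho * z + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)


def sample_level(rng):
    """Draw J with P(J = j) = 2**(-j) for j = 1, 2, ... via fair coin flips."""
    level = 1
    while rng.random() < 0.5:
        level += 1
    return level


def mlmc_gradient(x, z, rng, base_batch=2, max_factor=64, rho=0.5):
    """Randomized-batch estimate of f'(x) for the toy objective f(x) = x**2 / 2.

    The noisy gradient at x is x + z_i, with z_i driven by the chain above.
    Returns (estimate, new chain state, number of samples consumed).
    """
    factor = 2 ** sample_level(rng)
    use_correction = factor <= max_factor  # truncate very deep levels
    n = base_batch * factor if use_correction else base_batch

    grads = []
    for _ in range(n):
        z = ar1_step(z, rho, rng)
        grads.append(x + z)

    # Base estimate: mean over the first base_batch samples.
    g_base = sum(grads[:base_batch]) / base_batch
    if not use_correction:
        return g_base, z, n

    g_fine = sum(grads) / n                     # mean over all n samples
    g_coarse = sum(grads[: n // 2]) / (n // 2)  # mean over the first half
    return g_base + factor * (g_fine - g_coarse), z, n
```

The chain state `z` is returned and passed back in on the next call, so consecutive estimates consume one continuous trajectory of the noise process rather than restarting it.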

Paper Structure

This paper contains 21 sections, 33 theorems, 197 equations, 2 tables, 3 algorithms.

Key Result

Lemma 1

Assume as:Markov_noise_UGE and as:bounded_markov_noise_UGE. Then, for any $n \geq 1$ and $x \in \mathbb{R}^{d}$, it holds that [display equation missing]. Moreover, for any initial distribution $\xi$ on $(\mathsf{Z},\mathcal{Z})$, it holds that [display equation missing], where $C_{1} = 16\left(1 + \frac{1}{\ln^{2}{4}}\right)$.

Theorems & Definitions (50)

  • Lemma 1
  • proof
  • Lemma 2
  • Theorem 1
  • Corollary 1
  • Theorem 2
  • Proposition 1
  • Proposition 2
  • Theorem 3
  • Corollary 2
  • ...and 40 more