Methods for Optimization Problems with Markovian Stochasticity and Non-Euclidean Geometry

Vladimir Solodkin; Andrew Veprikov; Aleksandr Beznosikov

TL;DR

This work tackles optimization under Markovian stochasticity in arbitrary non-Euclidean geometry by extending the Mirror Descent and Mirror-Prox frameworks to stochastic settings with Markov noise. It introduces two algorithms, Markovian Accelerated Mirror Descent (MAMD) for smooth convex minimization and Markovian Mirror-Prox (MMP) for variational inequalities (VIs) with Lipschitz monotone operators, each in a batching and a non-batching variant, for four methods in total. Both are analyzed under general norms and Bregman divergences. The resulting convergence rates depend on the mixing time $\tau_{\text{mix}}$, and the accompanying lower bounds show that they are optimal for first-order stochastic methods in these Markovian settings. The paper also provides a novel deviation bound for geometrically ergodic Markov chains and discusses oracle complexity.

Abstract

This paper examines a variety of classical optimization problems, including well-known minimization tasks and more general variational inequalities. We consider a stochastic formulation of these problems, and unlike most previous work, we take into account the complex Markov nature of the noise. We also consider the geometry of the problem in an arbitrary non-Euclidean setting, and propose four methods based on the Mirror Descent iteration technique. Theoretical analysis is provided for smooth and convex minimization problems and variational inequalities with Lipschitz and monotone operators. The convergence guarantees obtained are optimal for first-order stochastic methods, as evidenced by the lower bound estimates provided in this paper.
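To make the setting concrete, the following is a minimal sketch of plain (non-accelerated) stochastic mirror descent on the probability simplex with the entropic mirror map, where the gradient samples are driven by a finite-state Markov chain rather than drawn i.i.d. It is not the paper's MAMD or MMP method. The function names, the linear per-state costs, and the transition matrix `P` are all assumptions chosen for illustration.

```python
import numpy as np

def entropic_mirror_step(x, g, step):
    """One mirror descent step on the simplex with the KL (entropy) Bregman divergence."""
    # Subtracting g.min() does not change the normalized result; it only aids stability.
    w = x * np.exp(-step * (g - g.min()))
    return w / w.sum()

def markov_mirror_descent(costs, P, steps, step_size, seed=0):
    """Minimize f(x) = sum_z pi_z <costs[z], x> over the simplex (illustrative setup).

    costs: array of shape (n_states, d), one linear cost vector per chain state.
    P:     row-stochastic transition matrix of the noise chain (hypothetical).
    Returns the average of the iterates.
    """
    rng = np.random.default_rng(seed)
    n_states, d = costs.shape
    x = np.full(d, 1.0 / d)   # start at the uniform distribution
    avg = np.zeros(d)
    state = 0                 # arbitrary initial state of the chain
    for _ in range(steps):
        g = costs[state]      # gradient sample depends on the current chain state
        x = entropic_mirror_step(x, g, step_size)
        avg += x
        state = rng.choice(n_states, p=P[state])  # consecutive samples are correlated
    return avg / steps

# Two-state chain with stationary distribution (0.5, 0.5); the stationary mean cost
# is [0.5, 1.5, 1.75], so the minimizer over the simplex is the first vertex.
costs = np.array([[0.0, 1.0, 3.0], [1.0, 2.0, 0.5]])
P = np.array([[0.9, 0.1], [0.1, 0.9]])
x_avg = markov_mirror_descent(costs, P, steps=5000, step_size=0.1)
```

Because the chain is "sticky" (it stays in its state with probability 0.9), consecutive gradient samples are strongly correlated, which is the kind of dependence the mixing time quantifies.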

Paper Structure (16 sections, 16 theorems, 129 equations, 4 algorithms)

Introduction
Our contribution
Main results
Markovian Accelerated Mirror Descent
Markovian Mirror-Prox
Auxiliary Lemmas and Facts
Convexity of the squared norm
Cauchy–Schwarz inequality
Fenchel-Young inequality
Bregman Divergence properties
Proofs of results for MAMD without batching (Algorithm alg:MDGwb)
Proof of Lemmas lem:xuivjopeentertainment and lem:expect_bound_grad
Proofs of results for MAMD with batching (Algorithm alg:MDG)
Proofs of results for MMP without batching (Algorithm alg:MPGwb)
Proofs of results for MMP with batching (Algorithm alg:MPG)
...and 1 more section

Key Result

Theorem 1

Let Assumptions as:lip, as:conv, as:noise, as:var be satisfied. Let the problem eq:problem1 be solved by Algorithm alg:MDGwb. Assume that the stepsizes $\gamma_t$ and momentums $\beta_t$ are chosen such that $0 \leq (\beta_{t+1}-1)\gamma_{t+1} \leq \beta_t\gamma_t$, $\beta_t \geq 2 \gamma_t L$ for a ...
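As a sanity check on the two parameter conditions visible in the statement, the sketch below verifies them in exact rational arithmetic for one hypothetical schedule, $\beta_t = (t+1)/2$ and $\gamma_t = (t+1)/(4L)$, with indexing assumed to start at $t = 0$. This schedule is borrowed from classical accelerated schemes for illustration and is not claimed to be the tuning from Corollary 1. The theorem statement is truncated here, so any further requirements it imposes are not checked.

```python
from fractions import Fraction

def schedule(t, L):
    """Hypothetical schedule: beta_t = (t + 1) / 2, gamma_t = (t + 1) / (4 L)."""
    beta = Fraction(t + 1, 2)
    gamma = Fraction(t + 1) / (4 * L)
    return beta, gamma

def satisfies_visible_conditions(T, L):
    """Check 0 <= (beta_{t+1} - 1) gamma_{t+1} <= beta_t gamma_t and
    beta_t >= 2 gamma_t L for t = 0, ..., T - 1, using exact fractions."""
    L = Fraction(L)
    for t in range(T):
        beta_t, gamma_t = schedule(t, L)
        beta_next, gamma_next = schedule(t + 1, L)
        lhs = (beta_next - 1) * gamma_next
        if not (0 <= lhs <= beta_t * gamma_t):
            return False
        if not (beta_t >= 2 * gamma_t * L):
            return False
    return True

ok = satisfies_visible_conditions(1000, 3)
```

For this schedule the first condition reduces to $t(t+2) \leq (t+1)^2$, and the second holds with equality, which is why exact fractions are used instead of floating point.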

Theorems & Definitions (29)

Theorem 1: Convergence of MAMD without batching (Algorithm alg:MDGwb)
Corollary 1: Parameters tuning for Theorem theorem:MDGwb
Lemma 1
Lemma 2
Theorem 2: Convergence of MAMD with batching (Algorithm alg:MDG)
Corollary 2: Parameters tuning for Theorem theorem:MDG
Proposition 1: Lower bound for eq:problem1
Theorem 3: Convergence of MMP without batching (Algorithm alg:MPGwb)
Corollary 3: Parameters tuning for Theorem theorem:MPGwb
Theorem 4: Convergence of MMP
...and 19 more
