Generalized Smooth Stochastic Variational Inequalities: Almost Sure Convergence and Convergence Rates

Daniil Vankov, Angelia Nedich, Lalitha Sankar

TL;DR

The paper addresses stochastic variational inequalities with generalized smooth, $\alpha$-symmetric operators that are not necessarily monotone. It develops and analyzes clipped stochastic projection and clipped Korpelevich methods under $p$-quasi sharpness, proving almost-sure convergence without assuming bounded stochastic noise and deriving unbiased in-expectation rates for $\alpha\le\tfrac{1}{2}$. A key methodological contribution is the two-sample clipping approach, which decouples clipping from stochastic error and enables unbiased convergence analysis. The results show $\mathcal{O}(1/k)$ rates for $p=2$ and $\mathcal{O}(k^{-2(1-q)/p})$ rates for $p>2$ (with $\tfrac{1}{2}<q<1$), validated by numerical experiments on generalized smooth SVIs, and extend the theoretical toolkit for adversarial training and multi-agent learning contexts.

Abstract

This paper focuses on solving a stochastic variational inequality (SVI) problem under a relaxed smoothness assumption for a class of structured non-monotone operators. The SVI problem has attracted significant interest in the machine learning community due to its immediate application to adversarial training and multi-agent reinforcement learning. In many such applications, the resulting operators do not satisfy the standard smoothness assumption. To address this issue, we focus on a weaker generalized smoothness assumption called $\alpha$-symmetry. Under $p$-quasi sharpness and $\alpha$-symmetry assumptions on the operator, we study clipped projection (gradient descent-ascent) and clipped Korpelevich (extragradient) methods. For these clipped methods, we provide the first almost-sure convergence results without making any assumptions on the boundedness of either the stochastic operator or the stochastic samples. We also provide the first in-expectation unbiased convergence rate results for these methods under a relaxed smoothness assumption for $\alpha\leq \frac{1}{2}$.
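To make the two-sample clipping idea concrete, the following is a minimal sketch of a clipped stochastic projection step, not the authors' implementation. It assumes the update $x_{k+1} = P_U\big(x_k - \gamma_k \Phi(x_k, \xi_k^2)\big)$ with $\gamma_k = \beta_k \min\{1, 1/\|\Phi(x_k, \xi_k^1)\|\}$, where $\xi_k^1$ and $\xi_k^2$ are independent samples. The toy operator $F(x) = \|x\|^2 x$, the box constraint, the noise level, and the names `stochastic_operator`, `project_box`, and `clipped_projection` are all hypothetical choices made for illustration.

```python
import math
import random


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))


def project_box(x, lo=-5.0, hi=5.0):
    """Euclidean projection onto the box [lo, hi]^n (assumed stand-in for U)."""
    return [min(max(xi, lo), hi) for xi in x]


def stochastic_operator(x, rng, sigma=0.1):
    """Assumed toy operator F(x) = ||x||^2 x with additive Gaussian noise."""
    scale = norm(x) ** 2
    return [scale * xi + rng.gauss(0.0, sigma) for xi in x]


def clipped_projection(x0, steps=2000, seed=0):
    """Clipped stochastic projection with two independent samples per step."""
    rng = random.Random(seed)
    x = list(x0)
    for k in range(steps):
        # Step-size schedule 100 / (100 + k^(1/2 + eps)) with eps = 0.1 (assumed).
        beta = 100.0 / (100.0 + k ** 0.6)
        # Sample 1 is used only to set the clipping level.
        g_clip = stochastic_operator(x, rng)
        # Sample 2 is drawn independently and drives the update direction.
        g_step = stochastic_operator(x, rng)
        # Equivalent to beta * min(1, 1 / ||g_clip||), without dividing by zero.
        gamma = beta / max(1.0, norm(g_clip))
        x = project_box([xi - gamma * gi for xi, gi in zip(x, g_step)])
    return x


if __name__ == "__main__":
    x_final = clipped_projection([2.0, -1.5])
    print("final iterate:", x_final, "norm:", norm(x_final))
```

Because the clipping factor is computed from a sample that is independent of the one used in the update, the step size and the stochastic error in the direction are decoupled given $x_k$, which is the property the TL;DR attributes to the two-sample approach. The toy operator is not globally Lipschitz and satisfies $\langle F(x), x - x^*\rangle = \|x\|^4$ at the solution $x^* = 0$, so it resembles the $p = 4$ setting under the usual form of a quasi-sharpness condition; that correspondence is an assumption of this sketch.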

Paper Structure

This paper contains 27 sections, 19 theorems, 201 equations, 4 figures, 1 table.

Key Result

Proposition 2.2

Let $U\subseteq \mathbb{R}^m$ be a nonempty convex set and let $F: U\to\mathbb{R}^m$ be an operator. Then, the following statements hold:

Figures (4)

  • Figure 1: Comparison of the clipped stochastic projection, same-sample projection, Korpelevich, and Popov methods with $\beta_k = 100/(100 + k^{1/2 + \epsilon})$.
  • Figure 2: Comparison of the clipped stochastic projection, same-sample projection, Korpelevich, and Popov methods with $\beta_k = 100/(100 + k^{1 - \epsilon})$.
  • Figure 3: Comparison of the clipped stochastic projection, same-sample projection, Korpelevich, and Popov methods with $\beta_k = 100/(100 + k^{1/2 + \epsilon})$ for averaged iterates.
  • Figure 4: Comparison of the clipped stochastic projection, same-sample projection, Korpelevich, and Popov methods with $\beta_k = 100/(100 + k^{1 - \epsilon})$ for averaged iterates.
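The two step-size schedules named in the figure captions can be written out as a short sketch. The value $\epsilon = 0.1$ and the choice to start the index at $k = 0$ are assumptions, since the captions do not specify them; the function names are hypothetical.

```python
def beta_half(k, eps=0.1):
    """Schedule 100 / (100 + k^(1/2 + eps)); eps = 0.1 is an assumed value."""
    return 100.0 / (100.0 + k ** (0.5 + eps))


def beta_one(k, eps=0.1):
    """Schedule 100 / (100 + k^(1 - eps)); eps = 0.1 is an assumed value."""
    return 100.0 / (100.0 + k ** (1.0 - eps))


if __name__ == "__main__":
    for k in (0, 10, 1000, 100000):
        print(k, round(beta_half(k), 4), round(beta_one(k), 4))
```

For any $0 < \epsilon < \tfrac{1}{2}$, both schedules decay like $k^{-q}$ with $\tfrac{1}{2} < q < 1$, so $\sum_k \beta_k$ diverges while $\sum_k \beta_k^2$ converges. The second schedule decays faster whenever $\epsilon < \tfrac{1}{4}$.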

Theorems & Definitions (33)

  • Proposition 2.2: DBLP:conf/icml/00020LL23, Proposition 1
  • Lemma 3.1
  • Theorem 3.2
  • Lemma 3.3
  • Theorem 3.4
  • Lemma 4.1
  • Theorem 4.2
  • Lemma 4.3
  • Theorem 4.4
  • Lemma A.1
  • ...and 23 more