Shuffling Heuristic in Variational Inequalities: Establishing New Convergence Guarantees

Daniil Medyakov, Gleb Molodtsov, Grigoriy Evseev, Egor Petrov, Aleksandr Beznosikov

TL;DR

This work studies the shuffling heuristic for finite-sum variational inequalities (VI), where permutation-based data processing has so far lacked theoretical guarantees. It introduces two Extragradient variants with shuffling, Random Reshuffling (RR) and Shuffle Once (SO), and augments them with variance reduction (VR). The analysis provides the first convergence bounds for shuffling in VI and proves linear convergence in the VR setting. Experiments on image denoising and adversarial training complement the theory: shuffling-based methods consistently outperform independent-index schemes, with Random Reshuffling often yielding the strongest gains. Overall, the results extend VI algorithmic theory to practical, large-scale finite-sum settings and demonstrate tangible improvements in convergence speed and robustness.

Abstract

Variational inequalities have gained significant attention in machine learning and optimization research. While stochastic methods for solving these problems typically assume independent data sampling, we investigate an alternative approach -- the shuffling heuristic. This strategy involves permuting the dataset before sequential processing, ensuring equal consideration of all data points. Despite its practical utility, theoretical guarantees for shuffling in variational inequalities remain unexplored. We address this gap by providing the first theoretical convergence estimates for shuffling methods in this context. Our analysis establishes rigorous bounds and convergence rates, extending the theoretical framework for this important class of algorithms. We validate our findings through extensive experiments on diverse benchmark variational inequality problems, demonstrating faster convergence of shuffling methods compared to independent sampling approaches.
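
To make the shuffling heuristic described above concrete, here is a minimal Python sketch of an Extragradient loop whose index order is produced by Random Reshuffling, Shuffle Once, or independent sampling. It is an illustration under stated assumptions, not the paper's algorithms: the same-sample Extragradient update, the toy strongly monotone affine operators $F_i(z) = A_i z - b_i$, and the names `epoch_orders`, `extragradient`, and `make_toy_problem` are all hypothetical. The step size in the demo is simply chosen to satisfy the condition $\gamma \leqslant \min\{1/(2\mu n), 1/(6L)\}$ quoted in the Key Result below.

```python
import numpy as np


def epoch_orders(n, epochs, mode, rng):
    """Yield one index order per epoch (hypothetical helper).

    'rr'  : Random Reshuffling, a fresh permutation every epoch.
    'so'  : Shuffle Once, one permutation reused in every epoch.
    'iid' : independent sampling with replacement (baseline).
    """
    fixed = rng.permutation(n)
    for _ in range(epochs):
        if mode == "rr":
            yield rng.permutation(n)
        elif mode == "so":
            yield fixed
        elif mode == "iid":
            yield rng.integers(0, n, size=n)
        else:
            raise ValueError(f"unknown mode: {mode}")


def extragradient(ops, z0, gamma, epochs, mode, seed=0):
    """Same-sample stochastic Extragradient (an assumed variant).

    Each inner step uses one operator F_i for both the extrapolation
    and the update; the order of i is set by `mode`.
    """
    rng = np.random.default_rng(seed)
    z = np.array(z0, dtype=float)
    for order in epoch_orders(len(ops), epochs, mode, rng):
        for i in order:
            z_half = z - gamma * ops[i](z)   # extrapolation step
            z = z - gamma * ops[i](z_half)   # update step
    return z


def make_toy_problem(n=20, d=2, mu=1.0, seed=0):
    """Toy finite-sum VI: F_i(z) = A_i z - b_i, A_i = mu*I + skew part."""
    rng = np.random.default_rng(seed)
    B_bar = rng.normal(size=(d, d))
    b_bar = rng.normal(size=2 * d)
    zeros = np.zeros((d, d))
    A_list, b_list = [], []
    for _ in range(n):
        B = B_bar + 0.1 * rng.normal(size=(d, d))
        A_list.append(mu * np.eye(2 * d) + np.block([[zeros, B], [-B.T, zeros]]))
        b_list.append(b_bar + 0.1 * rng.normal(size=2 * d))
    ops = [lambda z, A=A, b=b: A @ z - b for A, b in zip(A_list, b_list)]
    # Solution of the averaged operator: mean(A_i) z* = mean(b_i).
    z_star = np.linalg.solve(np.mean(A_list, axis=0), np.mean(b_list, axis=0))
    L = max(np.linalg.norm(A, 2) for A in A_list)  # Lipschitz constant
    return ops, z_star, L


if __name__ == "__main__":
    mu = 1.0
    ops, z_star, L = make_toy_problem(mu=mu)
    n = len(ops)
    gamma = min(1.0 / (2 * mu * n), 1.0 / (6 * L))
    z0 = np.full(z_star.shape, 10.0)
    for mode in ("rr", "so", "iid"):
        z = extragradient(ops, z0, gamma, epochs=50, mode=mode)
        print(mode, np.linalg.norm(z - z_star))
```

Running the script prints the final distance to the solution for each sampling mode on this toy problem; the numbers depend on the seed and are not meant to reproduce the paper's experiments.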

Paper Structure

This paper contains 14 sections, 4 theorems, 63 equations, 5 figures, 2 tables, 3 algorithms.

Key Result

theorem 1

Suppose Assumptions (as:lipschitz), (as:monotone), and (as:bound) hold. Then for Algorithms (alg:rrextragrad) and (alg:soextragrad) with step size $\gamma\leqslant\min\left\{\frac{1}{2\mu n}, \frac{1}{6L}\right\}$, after $S$ epochs,

Figures (5)

  • Figure 1: Extragradient convergence on image with $\sigma = 0.05$ on the problem \ref{problem_denoising}.
  • Figure 2: Extragradient with VR convergence on image with $\sigma = 0.05$ on the problem \ref{problem_denoising}.
  • Figure 3: Extragradient with and without VR compared using various shuffling heuristics on the datasets shown above for the problem \ref{problem_adversarial}.
  • Figure 4: Extragradient convergence on image with $\sigma = 0.1$ on the problem \ref{problem_denoising}.
  • Figure 5: Extragradient with VR convergence on image with $\sigma = 0.1$ on the problem \ref{problem_denoising}.

Theorems & Definitions (10)

  • theorem 1
  • corollary 1
  • remark 1
  • theorem 2
  • corollary 2
  • remark 2
  • proof
  • proof
  • proof
  • proof