RAMPAGE: RAndomized Mid-Point for debiAsed Gradient Extrapolation

Abolfazl Hashemi

Abstract

A celebrated method for Variational Inequalities (VIs) is Extragradient (EG), which can be viewed as a standard discrete-time integration scheme. With this view in mind, we show in this paper that EG may suffer from discretization bias when applied to non-linear vector fields, conservative or otherwise. To resolve this discretization shortcoming, we introduce RAndomized Mid-Point for debiAsed Gradient Extrapolation (RAMPAGE) and its variance-reduced counterpart, RAMPAGE+, which leverages antithetic sampling. In contrast with EG, both methods are unbiased. Furthermore, by leveraging negative correlation, RAMPAGE+ acts as an unbiased, geometric path-integrator that completely removes internal first-order terms from the variance, provably improving upon RAMPAGE. We further demonstrate that both methods enjoy provable $\mathcal{O}(1/k)$ convergence guarantees for a range of problems, including root finding under co-coercive, co-hypomonotone, and generalized Lipschitzness regimes. We also introduce symmetrically scaled variants to extend our results to constrained VIs. Finally, we provide convergence guarantees of both methods for stochastic and deterministic smooth convex-concave games. Somewhat interestingly, despite being a randomized method, RAMPAGE+ attains purely deterministic bounds for a number of the studied settings.
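The abstract does not spell out the update rules, so the following is only a sketch under stated assumptions, not the paper's algorithm. It assumes that a randomized mid-point step evaluates the field at $x - t\,\eta F(x)$ with $t \sim \mathrm{Uniform}[0, 1]$, so that its expectation equals the average of $F$ along the extrapolation segment, and that the antithetic variant averages the evaluations at $t$ and $1 - t$. The function names, the step size `eta`, and the bilinear test field are all hypothetical.

```python
import numpy as np

def eg_step(F, x, eta):
    """Classical extragradient: extrapolate with a full step, then update."""
    x_half = x - eta * F(x)
    return x - eta * F(x_half)

def randomized_midpoint_step(F, x, eta, t):
    """Assumed RAMPAGE-style step: evaluate F at a point x - t*eta*F(x).

    With t drawn uniformly from [0, 1], the expected evaluation is the
    average of F along the segment from x to x - eta*F(x).
    """
    x_t = x - t * eta * F(x)
    return x - eta * F(x_t)

def antithetic_midpoint_step(F, x, eta, t):
    """Assumed RAMPAGE+-style step: average the evaluations at t and 1 - t."""
    g = F(x)
    avg = 0.5 * (F(x - t * eta * g) + F(x - (1.0 - t) * eta * g))
    return x - eta * avg

# Hypothetical test field: the bilinear game min_x max_y x*y, F(x, y) = (y, -x).
A = np.array([[0.0, 1.0], [-1.0, 0.0]])

def linear_field(z):
    return A @ z

rng = np.random.default_rng(0)
x0 = np.array([1.0, -2.0])
t = rng.uniform()  # one shared random draw for both randomized steps

x_eg = eg_step(linear_field, x0, 0.1)
x_rm = randomized_midpoint_step(linear_field, x0, 0.1, t)
x_anti = antithetic_midpoint_step(linear_field, x0, 0.1, t)
```

In this sketch, setting `t = 1` in the randomized step recovers the EG step. For a linear field the antithetic average does not depend on `t` at all, which gives a small illustration of how negative correlation can cancel first-order variance terms.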

Paper Structure (35 sections, 17 theorems, 230 equations, 1 figure)

Introduction
Motivation
Proposed Idea and Contributions
Related Work
Preliminaries and Problem Formulation
RAMPAGE and Its Analyses
Root-Finding Problems
Monotone Variational Inequalities and SS-RAMPAGE
RAMPAGE+ and Its Analyses
Root-Finding Problems
Monotone Variational Inequality Problems and SS-RAMPAGE+
Application to Min-Max Games
Deterministic Games
Stochastic Convex-Concave Games
Numerical Verification
...and 20 more sections

Key Result

Proposition 1

Suppose $F$ is an $\alpha$-symmetric $(L_0, L_1)$-Lipschitz operator. Then, for $\alpha \in (0, 1)$ we have where $K_0 = L_0 (2^{\frac{\alpha^2}{1 - \alpha}} + 1)$, $K_1 = L_1 \cdot 2^{\frac{\alpha^2}{1 - \alpha}}$ and $K_2 = L_1^{\frac{1}{1 - \alpha}} \cdot 2^{\frac{\alpha^2}{1 - \alpha}} \cdot 3^{\alpha} (1 - \alpha)^{\frac{\alpha}{1 - \alpha}}$.
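As a small numerical aid, the constants $K_0$, $K_1$, and $K_2$ can be evaluated directly from the formulas stated in Proposition 1. The helper below is a minimal sketch; the function name `symmetric_lipschitz_constants` and the example inputs are hypothetical and do not come from the paper.

```python
def symmetric_lipschitz_constants(L0, L1, alpha):
    """Evaluate K_0, K_1, K_2 from Proposition 1 for alpha in (0, 1)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Shared factor 2^{alpha^2 / (1 - alpha)} appearing in all three constants.
    two_pow = 2.0 ** (alpha ** 2 / (1.0 - alpha))
    K0 = L0 * (two_pow + 1.0)
    K1 = L1 * two_pow
    K2 = (
        L1 ** (1.0 / (1.0 - alpha))
        * two_pow
        * 3.0 ** alpha
        * (1.0 - alpha) ** (alpha / (1.0 - alpha))
    )
    return K0, K1, K2

# Example with hypothetical values L0 = L1 = 1 and alpha = 1/2.
K0, K1, K2 = symmetric_lipschitz_constants(1.0, 1.0, 0.5)
```

For $L_0 = L_1 = 1$ and $\alpha = 1/2$ the shared factor is $2^{1/2}$, so $K_0 = \sqrt{2} + 1$, $K_1 = \sqrt{2}$, and $K_2 = \sqrt{2} \cdot \sqrt{3} \cdot \tfrac{1}{2} = \sqrt{6}/2$.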

Figures (1)

Figure 1: Comparison of EG, RAMPAGE, and RAMPAGE+. See the Numerical Verification section for details. (a) is an unconstrained optimization task with a nonconvex 4th-order polynomial objective, (b) is a nonconvex-nonconcave min-max game involving high-frequency sinusoids, and (c) and (d) are a 2-dimensional nonconvex-nonconcave min-max game involving high-frequency sinusoids. In all settings, we find two stepsizes for EG on the edge of stability. The larger of the two causes EG to diverge due to its bias, while RAMPAGE+ still converges. Furthermore, by using antithetic sampling, RAMPAGE+ enjoys a significantly lower variance.

Theorems & Definitions (34)

Proposition 1
Lemma 1
Theorem 1
Theorem 2
Theorem 3
Remark 1
Theorem 4
Corollary 4.1
Theorem 5
Theorem 6
...and 24 more
