Mini-Extragradient Methods

Xiaozhi Liu, Yong Xia

TL;DR

This work addresses solving monotone nonlinear equations with Extragradient (EG) methods by removing the stepsize's dependence on the global Lipschitz constant $L$ and by reducing the per-iteration cost from two full mapping evaluations to coordinate-wise updates. It introduces three Mini-EG variants (Greedy Mini-EG, Random Mini-EG, and Watchdog-Max) that rely on componentwise Lipschitz constants $l_i$ and update only one coordinate per step (two for Watchdog-Max), while preserving convergence guarantees. Theoretical results establish ergodic and/or expected convergence rates, with Greedy Mini-EG offering sharper rates than vanilla EG in several standard application settings and Watchdog-Max achieving substantial practical speedups. Empirical tests on regularized decentralized logistic regression and compressed sensing show speedups exceeding $13\times$ over classical EG, highlighting significant gains in computational efficiency for large-scale problems.
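To make the contrast between the global constant $L$ and the componentwise constants $l_i$ concrete, here is a minimal sketch for an affine mapping $F(x) = Ax - b$. It assumes, as one natural reading that is not confirmed by the text above, that $l_i$ is the Lipschitz constant of the single component $F_i$, which for an affine map is the Euclidean norm of row $i$ of $A$. The matrix `A` and the helper names `row_lipschitz` and `global_lipschitz` are illustrative only.

```python
import math

def row_lipschitz(A):
    """Componentwise constants for F(x) = A x - b (assumed definition):
    |F_i(x) - F_i(y)| = |a_i^T (x - y)| <= ||a_i||_2 * ||x - y||_2."""
    return [math.sqrt(sum(a * a for a in row)) for row in A]

def global_lipschitz(A, iters=500):
    """Estimate L = ||A||_2 by power iteration on A^T A.
    Unlike the row norms, this needs repeated full matrix-vector products."""
    m, n = len(A), len(A[0])
    v = [1.0 / math.sqrt(n)] * n  # unit-norm starting vector
    sigma = 0.0
    for _ in range(iters):
        Av = [sum(a * x for a, x in zip(row, v)) for row in A]       # A v
        w = [sum(A[i][j] * Av[i] for i in range(m)) for j in range(n)]  # A^T (A v)
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0
        v = [x / norm_w for x in w]
        sigma = math.sqrt(norm_w)  # ||A^T A v|| tends to sigma_max^2
    return sigma

# Hypothetical 2x2 example matrix.
A = [[2.0, 1.0], [-1.0, 3.0]]
l = row_lipschitz(A)      # one cheap pass over each row
L = global_lipschitz(A)   # iterative estimate of the spectral norm
```

Under this reading each $l_i$ is read off from a single row, whereas $L$ requires an iterative spectral-norm estimate, and every $l_i$ is bounded above by $L$.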

Abstract

The Extragradient (EG) method stands as a cornerstone algorithm for solving monotone nonlinear equations but faces two important unresolved challenges: (i) how to select stepsizes without relying on the global Lipschitz constant or expensive line-search procedures, and (ii) how to reduce the two full evaluations of the mapping required per iteration to effectively one, without compromising convergence guarantees or computational efficiency. To address the first challenge, we propose the Greedy Mini-Extragradient (Mini-EG) method, which updates only the coordinate associated with the dominant component of the mapping at each extragradient step. This design capitalizes on componentwise Lipschitz constants that are far easier to estimate than the classical global Lipschitz constant. To further lower computational cost, we introduce a Random Mini-EG variant that replaces full mapping evaluations by sampling only a single coordinate per extragradient step. Although this resolves the second challenge from a theoretical standpoint, its practical efficiency remains limited. To bridge this gap, we develop the Watchdog-Max strategy, motivated by the slow decay of dominant component magnitudes. Instead of evaluating the full mapping, Watchdog-Max identifies and tracks only two coordinates at each extragradient step, dramatically reducing per-iteration cost while retaining strong practical performance. We establish convergence guarantees and rate analyses for all proposed methods. In particular, Greedy Mini-EG achieves enhanced convergence rates that surpass the classical guarantees of the vanilla EG method in several standard application settings. Numerical experiments on regularized decentralized logistic regression and compressed sensing show speedups exceeding $13\times$ compared with the classical EG method on both synthetic and real datasets.
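For reference, the classical EG baseline that the abstract compares against can be sketched as follows. This is the textbook unconstrained scheme, not the Mini-EG variants proposed in the paper: each iteration evaluates the full mapping twice and uses a stepsize tied to the global Lipschitz constant $L$. The test mapping `F`, the stepsize choice `0.5 / L`, and the iteration count are assumptions chosen for illustration.

```python
import math

def extragradient(F, x0, eta, iters=200):
    """Classical extragradient for F(x) = 0 with monotone, L-Lipschitz F.
    Each iteration costs two full evaluations of F."""
    x = list(x0)
    for _ in range(iters):
        Fx = F(x)                                            # 1st full evaluation
        x_half = [xi - eta * gi for xi, gi in zip(x, Fx)]    # extrapolation step
        F_half = F(x_half)                                   # 2nd full evaluation
        x = [xi - eta * gi for xi, gi in zip(x, F_half)]     # update step
    return x

def F(x):
    """Illustrative monotone affine map F(x) = A x - b with
    A = [[1, 2], [-2, 1]] and b = [1, 0]; (x - y)^T A (x - y) = ||x - y||^2 >= 0."""
    return [x[0] + 2.0 * x[1] - 1.0, -2.0 * x[0] + x[1]]

L = math.sqrt(5.0)  # ||A||_2, since A^T A = 5 I for this A
x = extragradient(F, [0.0, 0.0], eta=0.5 / L)
residual = math.sqrt(sum(g * g for g in F(x)))
```

The two calls to `F` per iteration and the need to know `L` in advance are exactly the two costs that the challenges (i) and (ii) above refer to.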

Paper Structure

This paper contains 12 sections, 9 theorems, 52 equations, 4 figures, 2 tables.

Key Result

Lemma 3.2

Let $\Omega \subseteq \mathbb{R}^n$ be a nonempty, closed, and convex set. Then the projection operator $P_{\Omega}$ is 1-Lipschitz continuous; that is, $\|P_{\Omega}(x) - P_{\Omega}(y)\| \le \|x - y\|$ for all $x, y \in \mathbb{R}^n$.
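As a quick numerical sanity check of this 1-Lipschitz property, the sketch below projects random pairs of points onto two closed convex sets (a box and a Euclidean ball) and compares distances before and after projection. The sets, dimension, sampling range, and helper names are assumptions made for illustration; a random test like this illustrates the lemma but does not prove it.

```python
import math
import random

def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n (coordinate-wise clipping)."""
    return [min(max(xi, lo), hi) for xi in x]

def project_ball(x, radius):
    """Euclidean projection onto the ball of the given radius centred at 0."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm <= radius:
        return list(x)
    return [radius * xi / norm for xi in x]

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def max_expansion_ratio(project, n=5, trials=1000, seed=0):
    """Largest observed ||P(x) - P(y)|| / ||x - y|| over random pairs;
    the lemma says this ratio never exceeds 1."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = [rng.uniform(-3.0, 3.0) for _ in range(n)]
        y = [rng.uniform(-3.0, 3.0) for _ in range(n)]
        d = dist(x, y)
        if d > 0.0:
            worst = max(worst, dist(project(x), project(y)) / d)
    return worst

box_ratio = max_expansion_ratio(lambda x: project_box(x, -1.0, 1.0))
ball_ratio = max_expansion_ratio(lambda x: project_ball(x, 1.0))
```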

Figures (4)

  • Figure 1: Slow decay of dominant component magnitudes in Watchdog-Max on the colon-cancer dataset. The $y$-axis shows the normalized rank, computed as the rank of the component magnitude at the selected index $i_k$ divided by the feature dimension $n$. The red dashed line marks the detection of the window-adjustment condition, which triggers an adaptive window reset.
  • Figure 1: Signal recovery visualization using Watchdog-Max with $n=2048$, $N=512$, $K=32$, and SNR = 20 dB. (a) Original signal; (b) Observation; (c) Reconstruction.
  • Figure 2: Performance comparison of different methods in terms of (a) Tcpu, (b) Itr, and (c) NF.
  • Figure 3: Sensitivity of $\rho$ for EG and Mini-EG on regularized decentralized logistic regression, measured by Tcpu, Itr, and NF. Curves show average results, and shaded areas indicate one standard deviation. Datasets: colon-cancer (a-c), duke breast-cancer (d-f), leukemia (g-i).

Theorems & Definitions (23)

  • Definition 2.3
  • Remark 2.4
  • Remark 2.6
  • Remark 3.1
  • Lemma 3.2
  • Lemma 3.3
  • Proof 1
  • Corollary 3.4
  • Lemma 3.5
  • Proof 2
  • ...and 13 more