Mirror-Free Proximal Methods

Abhijeet Vyas, Brian Bullins

Abstract

We present a \emph{mirror-free} mirror prox (MFMP) algorithm, which extends the classic approach of Nemirovski (2004) to allow for proximal-like updates without the explicit need for a mirror map. We further analyze the convergence of our method under suitable notions of relative smoothness and relative Lipschitzness, for which we introduce a relaxation of the standard Bregman divergence in terms of more general potential operators. Finally, we show how a strongly monotone variant of our method allows us to solve regularized Taylor-expansion subproblems that appear in both second- and third-order smooth min-max optimization.

Paper Structure (22 sections, 17 theorems, 124 equations, 3 figures, 2 algorithms)

Key Result

Lemma 2.4

For any three points $z_a,z_b, z_c \in \mathcal{Z}$, the GBD with respect to $H$ satisfies

Figures (3)

  • Figure 1: An illustration of the construction used to prove Corollary \ref{cor:norelip1}.
  • Figure 2: MFMP-SM on Example \ref{eg:smooth} with the entries of all parameters of the function $f$ sampled from a standard normal distribution. In both cases the operator norm $\|F(z_k)\|$ converges to zero up to machine error.
  • Figure 3: MFMP-SM on Example \ref{eg:eg2} with all parameters of the function $f$ sampled from a standard normal distribution. The algorithm was run on the sub-problem generated at $z_a$ sampled randomly from 10 different initializations.

Theorems & Definitions (52)

  • Definition 2.1: Bregman Divergence
  • Definition 2.2: Line integral of an operator
  • Definition 2.3: Generalized Bregman Divergence
  • Lemma 2.4: Three point property
  • Definition 2.5: Monotonicity
  • Lemma 2.6
  • Lemma 2.7
  • Definition 2.8: Operator Relative Smoothness
  • Lemma 2.9: Jacobians of relatively smooth operators
  • Lemma 2.10
  • ...and 42 more