Optimal Guarantees for Algorithmic Reproducibility and Gradient Complexity in Convex Optimization

Liang Zhang; Junchi Yang; Amin Karbasi; Niao He

TL;DR

This work demonstrates that both optimal reproducibility and near-optimal convergence guarantees can be achieved for smooth convex minimization and smooth convex-concave minimax problems under various error-prone oracle settings.

Abstract

Algorithmic reproducibility measures the deviation in outputs of machine learning algorithms upon minor changes in the training process. Previous work suggests that first-order methods need to trade off convergence rate (gradient complexity) for better reproducibility. In this work, we challenge this perception and demonstrate that both optimal reproducibility and near-optimal convergence guarantees can be achieved for smooth convex minimization and smooth convex-concave minimax problems under various error-prone oracle settings. In particular, given the inexact initialization oracle, our regularization-based algorithms achieve the best of both worlds - optimal reproducibility and near-optimal gradient complexity - for minimization and minimax optimization. With the inexact gradient oracle, the near-optimal guarantees also hold for minimax optimization. Additionally, with the stochastic gradient oracle, we show that stochastic gradient descent ascent is optimal in terms of both reproducibility and gradient complexity. We believe our results contribute to an enhanced understanding of the reproducibility-convergence trade-off in the context of convex optimization.

Paper Structure (44 sections, 32 theorems, 144 equations, 2 figures, 2 tables, 5 algorithms)

This paper contains 44 sections, 32 theorems, 144 equations, 2 figures, 2 tables, 5 algorithms.

Introduction
Our Contributions
Related Works
Related Notions.
Minimax Optimization.
Inexact Gradient Oracles.
Regularization Technique.
Preliminaries in Algorithmic Reproducibility
Notation.
Deterministic Gradient Oracle for Minimization Problems
Inexact Initialization Oracle
Inexact Deterministic Gradient Oracle
Deterministic Gradient Oracle for Minimax Problems
Regularization Helps!
Inexact Proximal Point Method
...and 29 more sections

Key Result

Lemma 3.2

Let $x_r^*=\arg\min_{x\in\mathcal{X}} \{F(x) + (r/2)\lVert x-x_0\rVert^2\}$ and $(x_r^*)'=\arg\min_{x\in\mathcal{X}} \{F(x) + (r/2)\lVert x - x_0'\rVert^2\}$. When $F$ is convex, it holds that $\lVert x_r^* - (x_r^*)'\rVert^2 \leq \lVert x_0 - x_0'\rVert^2$ for any $r > 0$.
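The lemma can be checked numerically on a simple instance. The sketch below is an illustration, not code from the paper: it assumes an unconstrained convex quadratic $F(x) = \tfrac{1}{2}x^\top A x - b^\top x$ with $A$ positive semidefinite and $\mathcal{X} = \mathbb{R}^d$, so the regularized minimizer has the closed form $(A + rI)^{-1}(b + r x_0)$. The function names `regularized_minimizer` and `check_nonexpansive` are hypothetical.

```python
import numpy as np


def regularized_minimizer(A, b, x0, r):
    """Minimizer of 0.5*x^T A x - b^T x + (r/2)*||x - x0||^2 over R^d.

    Setting the gradient to zero gives (A + r I) x = b + r x0.
    """
    d = A.shape[0]
    return np.linalg.solve(A + r * np.eye(d), b + r * x0)


def check_nonexpansive(d=5, r=0.1, seed=0):
    """Return (||x_r* - (x_r*)'||^2, ||x0 - x0'||^2) for a random instance."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((d, d))
    A = M.T @ M                      # positive semidefinite, so F is convex
    b = rng.standard_normal(d)
    x0 = rng.standard_normal(d)
    x0_prime = x0 + 1e-2 * rng.standard_normal(d)   # perturbed initialization

    xr = regularized_minimizer(A, b, x0, r)
    xr_prime = regularized_minimizer(A, b, x0_prime, r)

    lhs = float(np.sum((xr - xr_prime) ** 2))
    rhs = float(np.sum((x0 - x0_prime) ** 2))
    return lhs, rhs


if __name__ == "__main__":
    for r in (1e-3, 0.1, 1.0, 10.0):
        lhs, rhs = check_nonexpansive(r=r)
        print(f"r={r}: deviation of minimizers {lhs:.3e} <= deviation of inits {rhs:.3e}")
```

In this quadratic case the difference of the two minimizers equals $r(A + rI)^{-1}(x_0 - x_0')$, whose norm cannot exceed $\lVert x_0 - x_0'\rVert$ because every eigenvalue of $r(A + rI)^{-1}$ lies in $(0, 1]$.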

Figures (2)

Figure 1: Comparison of GD, AGD, and their regularized versions on the quadratic minimization problem with $\delta$-inexact gradients. The left panel plots the convergence behavior and the right panel shows the reproducibility. Both axes use a logarithmic scale.
Figure 2: Comparison of GDA, EG, and their regularized versions on the bilinear matrix game with $\delta$-inexact gradients. The left panel plots the convergence behavior and the right panel shows the reproducibility. Both axes use a logarithmic scale.
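To make the setting of Figure 1 concrete, here is a minimal sketch of how a reproducibility measurement under $\delta$-inexact gradients might be set up. It is not the paper's experimental code: the random quadratic, the step size $1/L$, the way the gradient error is generated (a random direction scaled to norm $\delta$), and the names `inexact_gd` and `reproducibility_gap` are all assumptions made for illustration. It covers plain GD only and measures the squared distance between the outputs of two runs that differ only in their gradient errors.

```python
import numpy as np


def inexact_gd(A, b, x0, steps, lr, delta, rng):
    """Gradient descent on 0.5*x^T A x - b^T x where each gradient
    carries an additive error of norm delta (assumed error model)."""
    x = x0.copy()
    for _ in range(steps):
        noise = rng.standard_normal(x.shape)
        noise *= delta / np.linalg.norm(noise)   # rescale error to norm delta
        x = x - lr * (A @ x - b + noise)
    return x


def reproducibility_gap(delta, steps=200, d=10, seed=0):
    """Squared distance between two runs that see different gradient errors."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((d, d))
    A = M.T @ M / d                              # convex quadratic
    b = rng.standard_normal(d)
    lr = 1.0 / np.linalg.eigvalsh(A).max()       # step size 1/L
    x0 = np.zeros(d)

    # Same problem and initialization; only the error sequences differ.
    run1 = inexact_gd(A, b, x0, steps, lr, delta, np.random.default_rng(1))
    run2 = inexact_gd(A, b, x0, steps, lr, delta, np.random.default_rng(2))
    return float(np.sum((run1 - run2) ** 2))


if __name__ == "__main__":
    for delta in (0.0, 1e-3, 1e-2, 1e-1):
        print(f"delta={delta}: squared deviation = {reproducibility_gap(delta):.3e}")
```

With `delta=0.0` the two runs are identical, so the gap is exactly zero; for positive `delta` the gap is the quantity a reproducibility plot of this kind would track against the number of iterations.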

Theorems & Definitions (73)

Definition 1
Definition 2
Definition 3
Definition 4
Definition 5
Definition 6
Lemma 3.2
Theorem 3.3
Remark 1
Proposition 3.4
...and 63 more
