Interpolation Constraints for Computing Worst-Case Bounds in Performance Estimation Problems

Anne Rubbens; Nizar Bousselmi; Sebastien Colla; Julien M. Hendrickx

TL;DR

This work reviews several recent interpolation results and their implications for obtaining worst-case bounds via PEP. Interpolation constraints represent the effect of infinite-dimensional or complex objects on the vectors of interest, rather than the whole object, which leads to tractable finite-dimensional problems.

Abstract

The Performance Estimation Problem (PEP) approach computes worst-case performance bounds on optimization algorithms by solving an optimization problem: one maximizes an error criterion over all allowed initial conditions and all functions in a given class of interest. The maximal value is then a worst-case bound, and the maximizer provides an example reaching that worst case. This approach was introduced for optimization algorithms but could in principle be applied to many other contexts involving worst-case bounds. The key challenge is the representation of the infinite-dimensional objects involved in these optimization problems, such as functions, and of complex or non-convex objects, such as linear operators and their powers or networks in decentralized optimization. This challenge can be resolved by interpolation constraints, which represent the effect of these objects on vectors of interest, rather than the whole object, leading to tractable finite-dimensional problems. We review several recent interpolation results and their implications for obtaining worst-case bounds via PEP.

Paper Structure (14 sections, 7 theorems, 32 equations, 4 figures)

This paper contains 14 sections, 7 theorems, 32 equations, 4 figures.

Introduction
Worst-case bounds
Sources of Conservatism
Interpolation Constraints
Paper Organization
Automatic Computation of Worst-Case Bounds
Functions and Operators Interpolation
Linear Operator Interpolation
Motivations
Interpolation Constraints for Linear Operators
Network Matrices Interpolation
Distributed Optimization
Interpolation Constraints for Consensus Steps
Conclusions

Key Result

Theorem 1

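A set of triples $\{(x_i,g_i,f_i)\}_{i\in I}$ is $\mathcal{F}_{\mu,L}$-interpolable if and only if, for all $i,j\in I$,

$$f_i \geq f_j + \langle g_j, x_i-x_j\rangle + \frac{1}{2(1-\mu/L)}\left(\frac{1}{L}\|g_i-g_j\|^2 + \mu\|x_i-x_j\|^2 - 2\frac{\mu}{L}\langle g_i-g_j, x_i-x_j\rangle\right)$$

when $\mu\neq L$, and a separate condition applies otherwise.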
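As an illustration of how such a constraint acts on a finite set of points, the sketch below checks a candidate set of triples pairwise. It assumes the standard form of the smooth strongly convex interpolation inequality from the PEP literature for $0\leq\mu<L$, namely $f_i \geq f_j + \langle g_j, x_i-x_j\rangle + \frac{1}{2(1-\mu/L)}\left(\frac{1}{L}\|g_i-g_j\|^2 + \mu\|x_i-x_j\|^2 - 2\frac{\mu}{L}\langle g_i-g_j, x_i-x_j\rangle\right)$ for all $i,j$. The function name `is_interpolable`, the tolerance `tol`, and the data layout are hypothetical choices made for this example, and the case $\mu=L$ is not handled.

```python
from itertools import product


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def is_interpolable(triples, mu, L, tol=1e-9):
    """Check the pairwise interpolation inequality for 0 <= mu < L.

    `triples` is a list of (x, g, f) with x and g sequences of floats
    and f a float. Hypothetical helper, not taken from the paper.
    """
    if not 0 <= mu < L:
        raise ValueError("this sketch only covers 0 <= mu < L")
    for (xi, gi, fi), (xj, gj, fj) in product(triples, repeat=2):
        dx = [a - b for a, b in zip(xi, xj)]  # x_i - x_j
        dg = [a - b for a, b in zip(gi, gj)]  # g_i - g_j
        curvature = (dot(dg, dg) / L + mu * dot(dx, dx)
                     - 2 * (mu / L) * dot(dg, dx)) / (2 * (1 - mu / L))
        if fi < fj + dot(gj, dx) + curvature - tol:
            return False
    return True


# Samples of f(x) = x^2 / 2, which is 0.5-strongly convex and 2-smooth.
quadratic = [([x], [x], 0.5 * x * x) for x in (0.0, 1.0, -2.0)]
print(is_interpolable(quadratic, mu=0.5, L=2.0))

# Two points with x_1 = 0, g_1 = 1, f_1 = 0, x_2 = 1 and a candidate (g_2, f_2).
candidate = [([0.0], [1.0], 0.0), ([1.0], [1.0], 2.0)]
print(is_interpolable(candidate, mu=0.0, L=1.0))
```

The first set is sampled from an actual function in the class and passes the check, while the second violates the inequality for the pair $(i,j)=(1,2)$ and is rejected. In a PEP, the same inequalities are imposed as constraints on the decision variables rather than verified after the fact.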

Figures (4)

Figure 1: Evolution with the adimensional step-size $h=L\alpha$ of two worst-case guarantees on $f(x_{N})-f(x^*)$ for $L$-smooth convex functions $f$, where the number of iterations $N$ is set to $10$. The plot shows (i) the theoretical bound \ref{eq:bound_grad_desc} and (ii) the exact bound for this setting. This tight bound allows improving the step-size selection, since the choice $h=1$ minimizing the bound \ref{eq:bound_grad_desc} requires doubling the number of steps to obtain the same performance as the true optimal step-size. This highlights the benefits of relying on tight bounds for parameter selection and algorithm comparisons.
Figure 2: Comparison between the admissible values for $f_2$ and $g_2$ under the discretized versions of \ref{eq:smoothconvexity} (in grey) and \ref{eq:interp_smoothconvexity} (in black), respectively, assuming $x_2=1,x_1=0,g_1=1,f_1=0$. This shows that, despite the equivalence between these constraints when imposed on all pairs $(x,y)$, \ref{eq:smoothconvexity} is significantly weaker than \ref{eq:interp_smoothconvexity}. In particular, the points in the grey (but not black) area do not correspond to any actual function $f \in \mathcal{F}_L$.
Figure 3: Evolution with the adimensional step-size $h=L\alpha$ of three worst-case guarantees on $f(x_{10})-f(x^*)$ for $L$-smooth convex functions $f$. The plot shows (i) the theoretical bound \ref{eq:bound_grad_desc}, (ii) the best possible bound (i.e. PEP-based, see Section \ref{sec:PEP}) relying on the non-tight representation \ref{eq:smoothconvexity} of $\mathcal{F}_L$ (see Section \ref{sec:functions}), and (iii) the tight worst-case bound, i.e. the best possible bound relying on interpolation constraints \ref{eq:interp_smoothconvexity}. As already observed in Fig. \ref{fig:compare_grad_desc1_easy}, the step-sizes minimizing non-tight bounds lead to poorer actual performance. This holds even when the bound is PEP-based, that is, the best possible bound for a given class representation, as long as that representation is non-tight. This highlights the necessity of relying on tight descriptions of classes to efficiently tune methods.
Figure 4: This plot is inspired by [PEP_dec] and shows the evolution with $\lambda$ of the worst-case performance of $N = 10$ iterations of DGD with $V = 3$ agents and ${\alpha = \frac{1}{\sqrt{N}}}$. The plot shows (i) the theoretical bound from [DGD] (in pink), which lies well above (ii) the worst-case performance obtained with PEP using constraints from Theorem \ref{thm:int_cond_consmat} (in blue) and (iii) the exact worst-case performance for the averaging matrix $W^{(1)}$ (in green). The blue and green curves match, which indicates the tightness of the spectral PEP bound for DGD.

Theorems & Definitions (11)

Definition 1: $\mathcal{F}$-interpolability
Theorem 1: $\mathcal{F}_{\mu,L}$-interpolation constraint [PEP_Smooth, Rubbens_interp]
Theorem 2: $\mathcal{F}_{\mu,L}$-interpolation without function values
Theorem 3: $\mathcal{C}_{L,M}$-interpolation constraints [Taylor_thesis]
Theorem 4: $\mathcal{I}_M$-interpolation constraints [Taylor_thesis]
Definition 2: $\mathcal{Q}$-interpolability
Theorem 5: $\mathcal{M}_{\mu}$, $\mathcal{C}_{\beta}$, $\mathcal{N}_{L}$-interpolation [ryu2020operator]
Definition 3: $\mathcal{L}_L$-interpolability
Theorem 6: $\mathcal{L}_{L}$-interpolation constraints
Definition 4: $\mathcal{W}_{\lambda}$-interpolability
...and 1 more
