Interpolation Constraints for Computing Worst-Case Bounds in Performance Estimation Problems

Anne Rubbens, Nizar Bousselmi, Sebastien Colla, Julien M. Hendrickx

TL;DR

This work reviews several recent interpolation results and their implications for obtaining worst-case bounds via PEP. Interpolation constraints represent the effect of infinite-dimensional or complex objects on the vectors of interest, rather than the whole object, which leads to tractable finite-dimensional problems.

Abstract

The Performance Estimation Problem (PEP) approach consists in computing worst-case performance bounds on optimization algorithms by solving an optimization problem: one maximizes an error criterion over all allowed initial conditions and all functions in a given class of interest. The maximal value is then a worst-case bound, and the maximizer provides an example reaching that worst case. This approach was introduced for optimization algorithms but could in principle be applied to many other contexts involving worst-case bounds. The key challenge is the representation of the infinite-dimensional objects involved in these optimization problems, such as functions, and of complex or non-convex objects such as linear operators and their powers, networks in decentralized optimization, etc. This challenge can be resolved by interpolation constraints, which allow representing the effect of these objects on vectors of interest, rather than the whole object, leading to tractable finite-dimensional problems. We review several recent interpolation results and their implications for obtaining worst-case bounds via PEP.
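
As an illustrative sketch of the kind of finite-dimensional problem the abstract refers to, consider $N$ steps of gradient descent with step size $\alpha$ on a function class $\mathcal{F}$. The index set $I=\{0,\dots,N,*\}$, the radius $R$, and the choice of $f_N-f_*$ as error criterion are assumptions introduced here for illustration; the variables are the iterates, gradients and function values only, not the function itself.

$$
\begin{aligned}
\max_{\{(x_i,g_i,f_i)\}_{i\in I}} \quad & f_N - f_* \\
\text{s.t.} \quad & \{(x_i,g_i,f_i)\}_{i\in I} \text{ is } \mathcal{F}\text{-interpolable}, \\
& x_{i+1} = x_i - \alpha\, g_i, \quad i=0,\dots,N-1, \\
& g_* = 0, \quad \|x_0 - x_*\|^2 \leq R^2 .
\end{aligned}
$$

The interpolability constraint is what stands in for "there exists $f\in\mathcal{F}$ with $f(x_i)=f_i$ and $\nabla f(x_i)=g_i$", so the quality of the resulting bound depends on how tightly that constraint describes $\mathcal{F}$.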

Paper Structure

This paper contains 14 sections, 7 theorems, 32 equations, 4 figures.

Key Result

Theorem 1

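A set of triples $\{(x_i,g_i,f_i)\}_{i\in I}$ is $\mathcal{F}_{\mu,L}$-interpolable if and only if, for all $i,j\in I$,

$$f_i \geq f_j + \langle g_j, x_i-x_j\rangle + \frac{1}{2(1-\mu/L)}\left(\frac{1}{L}\|g_i-g_j\|^2 + \mu\|x_i-x_j\|^2 - \frac{2\mu}{L}\langle g_j-g_i, x_j-x_i\rangle\right)$$

when $\mu\neq L$, and [...] otherwise.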
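
A minimal sketch of how such a condition can be checked numerically on a finite set of triples. It assumes that, for $0 \le \mu < L < \infty$, the condition of Theorem 1 is the standard pairwise inequality of smooth strongly convex interpolation (written out in the code comment); the function name `is_interpolable`, the tolerance, and the sample data are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def sub(u, v):
    """Componentwise difference u - v."""
    return [a - b for a, b in zip(u, v)]


def is_interpolable(triples, mu, L, tol=1e-9):
    """Check the pairwise condition, assumed here for 0 <= mu < L < inf:

    f_i >= f_j + <g_j, x_i - x_j>
           + 1 / (2 (1 - mu/L)) * ( |g_i - g_j|^2 / L
                                    + mu |x_i - x_j|^2
                                    - 2 mu/L <g_j - g_i, x_j - x_i> )

    for every ordered pair (i, j). `triples` is a list of (x, g, f)
    with x and g lists of floats and f a float.
    """
    if not 0 <= mu < L:
        raise ValueError("this sketch only covers 0 <= mu < L")
    scale = 1.0 / (2.0 * (1.0 - mu / L))
    for (xi, gi, fi), (xj, gj, fj) in permutations(triples, 2):
        dx = sub(xi, xj)
        dg = sub(gi, gj)
        rhs = fj + dot(gj, dx) + scale * (
            dot(dg, dg) / L + mu * dot(dx, dx) - 2.0 * mu / L * dot(dg, dx)
        )
        if fi < rhs - tol:
            return False
    return True


# Samples of f(x) = x^2 (curvature 2), which belongs to F_{1,4}.
samples = [([x], [2.0 * x], x * x) for x in (-1.0, 0.0, 2.0)]
print(is_interpolable(samples, mu=1.0, L=4.0))

# Two points with x_1 = 0, g_1 = 1, f_1 = 0, x_2 = 1 and assumed L = 1, mu = 0:
# (g_2, f_2) = (2, 1.5) matches f(x) = x + x^2 / 2, while f_2 = 1.2 does not.
print(is_interpolable([([0.0], [1.0], 0.0), ([1.0], [2.0], 1.5)], 0.0, 1.0))
print(is_interpolable([([0.0], [1.0], 0.0), ([1.0], [2.0], 1.2)], 0.0, 1.0))
```

The check costs one inequality per ordered pair of points, which is why imposing it inside a PEP keeps the problem finite-dimensional.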

Figures (4)

  • Figure 1: Evolution with the dimensionless step-size $h=L\alpha$ of two worst-case guarantees on $f(x_{N})-f(x^*)$ for $L$-smooth convex functions $f$, with the number of iterations set to $N=10$. The plot shows (i) the theoretical bound \ref{eq:bound_grad_desc} and (ii) the exact bound for this setting. The tight bound allows improving the step-size selection: the choice $h=1$, which minimizes the bound \ref{eq:bound_grad_desc}, requires twice as many steps to reach the same performance as the true optimal step-size. This highlights the benefits of relying on tight bounds for parameter selection and algorithm comparisons.
  • Figure 2: Comparison between the admissible values for $f_2$ and $g_2$ under the discretized versions of \ref{eq:smoothconvexity} (in grey) and \ref{eq:interp_smoothconvexity} (in black), respectively, assuming $x_2=1,x_1=0,g_1=1,f_1=0$. This shows that, although these constraints are equivalent when imposed on all pairs $(x,y)$, the discretized version of \ref{eq:smoothconvexity} is significantly weaker than that of \ref{eq:interp_smoothconvexity}. In particular, the points in the grey (but not black) area do not correspond to any actual function $f \in \mathcal{F}_L$.
  • Figure 3: Evolution with the dimensionless step-size $h=L\alpha$ of three worst-case guarantees on $f(x_{10})-f(x^*)$ for $L$-smooth convex functions $f$. The plot shows (i) the theoretical bound \ref{eq:bound_grad_desc}, (ii) the best possible bound (i.e. PEP-based, see Section \ref{sec:PEP}) relying on the non-tight representation \ref{eq:smoothconvexity} of $\mathcal{F}_L$ (see Section \ref{sec:functions}), and (iii) the tight worst-case bound, i.e. the best possible bound relying on the interpolation constraints \ref{eq:interp_smoothconvexity}. As already observed in Fig. \ref{fig:compare_grad_desc1_easy}, the step-sizes minimizing non-tight bounds lead to poorer actual performance. This holds even when the bound is PEP-based, that is, the best possible bound for a given class representation, if that representation is non-tight. This highlights the necessity of relying on tight descriptions of classes to efficiently tune methods.
  • Figure 4: This plot is inspired by PEP_dec and shows the evolution with $\lambda$ of the worst-case performance of $N = 10$ iterations of DGD with $V = 3$ agents and ${\alpha = \frac{1}{\sqrt{N}}}$. The plot shows (i) the theoretical bound from DGD (in pink), which lies largely above (ii) the worst-case performance obtained with PEP using constraints from Theorem \ref{thm:int_cond_consmat} (in blue) and (iii) the exact worst-case performance for the averaging matrix $W^{(1)}$ (in green). The blue and green curves coincide, which indicates the tightness of the spectral PEP bound for DGD.
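
To give a feel for the step-size dependence discussed in Figures 1 and 3, the sketch below runs $N=10$ gradient-descent steps on two hand-picked one-dimensional $L$-smooth convex functions and reports the larger final value of $f(x_N)-f(x^*)$. This is not a PEP: it only gives a lower bound on the worst case over the class. The two test functions (a quadratic and a Huber-type function with threshold `tau`), the normalization $L=1$, $|x_0-x^*|=1$, and all names are assumptions made here for illustration and are not taken from the text above.

```python
def gradient_descent(grad, x0, step, n_iters):
    """Run n_iters steps of x <- x - step * grad(x) and return the last iterate."""
    x = x0
    for _ in range(n_iters):
        x = x - step * grad(x)
    return x


def quadratic(L):
    """f(x) = L/2 x^2, minimized at x* = 0 with f* = 0."""
    return (lambda x: 0.5 * L * x * x), (lambda x: L * x)


def huber(L, tau):
    """Quadratic for |x| <= tau, linear beyond; L-smooth, convex, f* = 0 at x* = 0."""
    def f(x):
        if abs(x) <= tau:
            return 0.5 * L * x * x
        return L * tau * abs(x) - 0.5 * L * tau * tau

    def grad(x):
        if abs(x) <= tau:
            return L * x
        return L * tau * (1.0 if x > 0 else -1.0)

    return f, grad


def candidate_worst_case(h, n_iters=10, L=1.0, x0=1.0):
    """Largest f(x_N) - f* over the two test functions, with step h / L."""
    # Assumed threshold: chosen so that the iterates stay in the linear part.
    tau = x0 / (2 * n_iters * h + 1)
    values = []
    for f, grad in (quadratic(L), huber(L, tau)):
        x_final = gradient_descent(grad, x0, h / L, n_iters)
        values.append(f(x_final))
    return max(values)


for h in (0.5, 1.0, 1.5, 1.9):
    print(f"h = {h:.1f}: {candidate_worst_case(h):.5f}")
```

On these two functions the value at $h=1.5$ is smaller than at $h=1$, while it grows again for $h$ close to $2$, which is consistent with the captions' remark that the best step-size need not be the one minimizing a non-tight bound.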

Theorems & Definitions (11)

  • Definition 1: $\mathcal{F}$-interpolability
  • Theorem 1: $\mathcal{F}_{\mu,L}$-interpolation constraint [PEP_Smooth, Rubbens_interp]
  • Theorem 2: $\mathcal{F}_{\mu,L}$-interpolation without function values
  • Theorem 3: $\mathcal{C}_{L,M}$-interpolation constraints [Taylor_thesis]
  • Theorem 4: $\mathcal{I}_M$-interpolation constraints [Taylor_thesis]
  • Definition 2: $\mathcal{Q}$-interpolability
  • Theorem 5: $\mathcal{M}_{\mu}$, $\mathcal{C}_{\beta}$, $\mathcal{N}_{L}$-interpolation [ryu2020operator]
  • Definition 3: $\mathcal{L}_L$-interpolability
  • Theorem 6: $\mathcal{L}_{L}$-interpolation constraints
  • Definition 4: $\mathcal{W}_{\lambda}$-interpolability
  • ...and 1 more