Examples of slow convergence for adaptive regularization optimization methods are not isolated
Philippe L. Toint
TL;DR
The paper analyzes adaptive regularization methods for unconstrained nonconvex optimization, focusing on the AR2 scheme and its worst-case evaluation complexity of $\mathcal{O}\left(\epsilon^{-\frac{3}{3-q}}\right)$ for computing an $\epsilon$-approximate $q$-order critical point, $q\in\{1,2\}$. It extends existing sharpness results by constructing a parametric family of one-dimensional, twice continuously differentiable, piecewise-polynomial functions $f_{\mathcal{A},\mathcal{B}}$ with Lipschitz continuous Hessian that interpolate the AR2 data and exhibit slow convergence when $p=2$, requiring $k_\epsilon=\left\lceil\epsilon^{-\frac{3}{3-q}}\right\rceil$ iterations. The key contribution is to show that such slow-convergence instances are not isolated but occupy a set of nonzero measure in function space, a consequence of the freedom available in choosing the interpolation perturbations $\mathcal{A},\mathcal{B}$. The construction is consistent with existing complexity sharpness results and clarifies how it differs from related asymptotic findings. Overall, the work broadens the understanding of worst-case behavior for adaptive regularization methods and shows that slow convergence has a richer structure than a single counterexample, which bears on practical expectations in nonconvex optimization.
Abstract
The adaptive regularization algorithm for unconstrained nonconvex optimization was shown in Nesterov and Polyak (2006) and Cartis, Gould and Toint (2011) to require, under standard assumptions, at most $\mathcal{O}(\epsilon^{-3/(3-q)})$ evaluations of the objective function and its derivatives of degrees one and two to produce an $\epsilon$-approximate critical point of order $q\in\{1,2\}$. This bound was shown to be sharp for $q \in\{1,2\}$. This note revisits these results and shows that the example for which slow convergence is exhibited is not isolated, but that this behaviour occurs for a subset of univariate functions of nonzero measure.
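
To give a feel for the size of this bound, the short sketch below evaluates $\left\lceil\epsilon^{-3/(3-q)}\right\rceil$ for a few tolerances. It is only an illustration: the function name `worst_case_iterations` is a hypothetical helper, and the constant hidden in the $\mathcal{O}(\cdot)$ notation is assumed to be one.

```python
import math


def worst_case_iterations(eps: float, q: int) -> int:
    """Return ceil(eps ** (-3 / (3 - q))) for q in {1, 2}.

    Illustrative helper (not from the paper): the constant hidden in
    the O(.) bound is assumed to be 1.
    """
    if q not in (1, 2):
        raise ValueError("q must be 1 or 2")
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    # Exponent is 3/2 for first-order points and 3 for second-order points.
    return math.ceil(eps ** (-3.0 / (3 - q)))


# Tabulate the count for a few tolerances and both orders of criticality.
for eps in (0.3, 0.1, 0.03):
    print(eps, worst_case_iterations(eps, 1), worst_case_iterations(eps, 2))
```

The exponent is $3/2$ for $q=1$ and $3$ for $q=2$, so tightening the tolerance is far more costly for second-order critical points than for first-order ones.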
