Superlinear convergence in nonsmooth optimization via higher-order cutting-plane models

Bennet Gebken; Michael Ulbrich

Abstract

A cutting-plane model for a nonsmooth function is the maximum of several first-order expansions centered at different points. Using such a model in a bundle method leads to linear convergence (of serious steps) to a minimum. In smooth optimization, superlinear convergence can be achieved by using higher-order models. We show that the same is true for the nonsmooth case, i.e., we show that cutting-plane models involving higher-order expansions can be used to achieve superlinear convergence in nonsmooth optimization. We first formally define higher-order cutting-plane models for lower-$C^2$ functions and derive an error estimate. Afterwards, we construct a trust-region bundle method based on these models that achieves local superlinear convergence of serious steps, and overall superlinear convergence for certain finite max-type functions. Finally, we verify the superlinear convergence in numerical experiments.
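The idea of replacing first-order expansions by higher-order ones can be illustrated with a minimal sketch. The paper's formal definition is not part of this excerpt, so the code below rests on assumptions: the toy objective $f(x) = \max(e^x, e^{-x})$, the centers `CENTERS`, and the rule of expanding the selection function that is active at each center are all chosen here for illustration and are not taken from the paper.

```python
import math

# Hypothetical toy objective f(x) = max(e^x, e^{-x}) = e^{|x|}, nonsmooth at 0.
# Each piece is a function d(k, y) returning its k-th derivative at y.
PIECES = [
    lambda k, y: math.exp(y),               # e^x: every derivative equals e^x
    lambda k, y: (-1) ** k * math.exp(-y),  # e^{-x}: sign alternates with k
]

CENTERS = [-0.5, 0.4]  # illustrative centers W (an assumption)


def f(x):
    """Evaluate the max-type objective."""
    return max(piece(0, x) for piece in PIECES)


def active_piece(y):
    """Return a piece that attains the maximum at y."""
    return max(PIECES, key=lambda piece: piece(0, y))


def taylor(piece, q, y, x):
    """q-th order Taylor expansion of `piece` around y, evaluated at x."""
    return sum(
        piece(k, y) * (x - y) ** k / math.factorial(k) for k in range(q + 1)
    )


def model(q, centers, x):
    """Maximum over all centers of the q-th order expansions at x."""
    return max(taylor(active_piece(y), q, y, x) for y in centers)


if __name__ == "__main__":
    x_star = 0.0  # minimizer of the toy objective
    for q in (1, 2, 3):
        err = abs(model(q, CENTERS, x_star) - f(x_star))
        print(f"q = {q}: |model - f| at x* = {err:.2e}")
```

For `q = 1` this reduces to the classical cutting-plane model. In this toy setting the model error at the kink shrinks as `q` grows, which is the kind of improved approximation the abstract refers to; the sketch says nothing about the trust-region bundle method itself.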

Paper Structure (9 sections, 8 theorems, 49 equations, 6 figures, 3 algorithms)

Introduction
Preliminaries
Higher-order cutting-plane models and error estimates
Trust-region bundle method with R-superlinear convergence
Globalization
Detecting superlinear convergence
Computing the initial data
Numerical experiments
Discussion and outlook

Key Result

Lemma 3.1

Let $q \in \mathbb{N}$ and assume that $f : U \rightarrow \mathbb{R}$ satisfies Assumption A1. Then for every bounded set $V \subseteq U$ and every $\varepsilon_{\max} > 0$ with $\mathrm{cl}(V + \bar{B}_{\varepsilon_{\max}}(0)) \subseteq U$, there is some $K \geq 0$ such that for all $x \in V$, $\varepsilon \in [0,\varepsilon_{\max}]$, and finite, nonempty sets $W \subseteq \bar{B}_\varep

Figures (6)

Figure 1: Higher-order cutting-plane models (red) for the nonconvex function $f : \mathbb{R} \rightarrow \mathbb{R}$, $x \mapsto \max(\{ -(x + 0.5)^2 + 0.25 |x|^{3/2} + 0.5, x^2 + 0.5 |x|^{3/2} - 0.25, -1/(|x| + 0.25) + 2 \})$ for different orders $q$ of Taylor expansions (dashed) and the centers $W = \{ -1.2, -0.9, -0.3, 0.75, 1.25 \}$ (dots). The different colors for the Taylor expansions correspond to different centers.
Figure 2: (a) The distance $\| x^j - x^* \|$ (black) for sequences $(x^j)_j$ generated by Alg. \ref{algo:local_method} with varying order $q$ of Taylor expansion in Ex. \ref{example:1d_symbolic}. The red, dotted lines show, depending on the marker, the corresponding upper bound $(\varepsilon_j)_j$ from Thm. \ref{thm:local_method_convergence}. (b) The distance $\| x^j - x^* \|$ in Ex. \ref{example:LW2019_85} and the corresponding sequence $(\varepsilon_j)_j$.
Figure 3: (a) The number of oracle calls required by Alg. \ref{algo:approx_W} in each iteration of Alg. \ref{algo:local_method} in Ex. \ref{example:LW2019_85}. (b) The distance $\| x^{j(l)} - x^* \|$ with $(x^{j(l)})_l$ as in Cor. \ref{cor:N_step_convergence}.
Figure 4: (a) The distance $\| x^j - \tilde{x}^* \|$ in Ex. \ref{example:LW2019_eigval} and the corresponding sequence $(\varepsilon_j)_j$. (b) The distance of the objective values to the reference value $f(\tilde{x}^*)$ with respect to oracle calls for Alg. \ref{algo:local_method} and HANSO.
Figure 5: (a) The distance $\| x^j - x^* \|$ in Ex. \ref{example:halfhalf} and the corresponding sequence $(\varepsilon_j)_j$. (b) The distance of the objective values to the optimal value $f(x^*)$ with respect to oracle calls for Alg. \ref{algo:local_method} and VUbundle. (The zoom on the result of Alg. \ref{algo:local_method} shows that it is not a descent method.)
...and 1 more figure

Theorems & Definitions (23)

Definition 2.1
Lemma 3.1
proof
Lemma 3.2
proof
Lemma 4.1
proof
Remark 4.2
Lemma 4.3
proof
...and 13 more
