Higher-order tensor methods for minimizing difference of convex functions

Ion Necoara

TL;DR

This work introduces a higher-order DC optimization framework (HO-DC) for minimizing $F(x)=f(x)+\psi(x)-g(x)$, where $\psi$ is convex (possibly nondifferentiable) and $f,g$ are convex with $p$- and $q$-order smoothness, respectively. HO-DC builds a surrogate model from regularized higher-order Taylor approximations of $f$ and $g$ and minimizes it to obtain $x_{k+1}$. The author proves that any limit point of the HO-DC sequence is a stationary point, that $F(x_k)$ decreases monotonically, and that the minimum subgradient norm $\min_{i<k} S_F(x_i)$ decays as $O\left(k^{-\frac{2\min(p,q)}{p+q+2}}\right)$; under the Kurdyka-Lojasiewicz (KL) property, the whole sequence converges at a linear or sublinear rate depending on the KL exponent $r>1$. For $p,q\in\{1,2\}$, the subproblem reduces to a one-dimensional convex problem or a cubic-regularized Newton step, which makes the method practical and unifies several DC algorithms (including proximal DCA) while extending them to higher-order settings. An adaptive variant, AH-DC, uses a line search over the regularization parameters to ensure descent without exact knowledge of the Lipschitz constants.

Abstract

Higher-order tensor methods were recently proposed for minimizing smooth convex and nonconvex functions. Higher-order algorithms accelerate the convergence of the classical first-order methods thanks to the higher-order derivatives used in the updates. The purpose of this paper is twofold. Firstly, to show that the higher-order algorithmic framework can be generalized and successfully applied to (nonsmooth) difference of convex functions, namely, those that can be expressed as the difference of two smooth convex functions and a possibly nonsmooth convex one. We also provide examples when the subproblem can be solved efficiently, even globally. Secondly, to derive a complete convergence analysis for our higher-order difference of convex functions (HO-DC) algorithm. In particular, we prove that any limit point of the HO-DC iterative sequence is a critical point of the problem under consideration, the corresponding objective value is monotonically decreasing and the minimum value of the norms of its subgradients converges globally to zero at a sublinear rate. The sublinear or linear convergence rates of the iterations are obtained under the Kurdyka-Lojasiewicz property.
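
To make the first-order case concrete, the following is a minimal sketch of what a step with $p=q=1$ might look like, under assumptions that are not taken from the paper: $f$ is replaced by its linearization plus a quadratic term $\frac{M}{2}\|y-x_k\|^2$ with $M$ at least the Lipschitz constant of $\nabla f$, $g$ is replaced by its linearization, and $\psi$ is kept exactly. This gives a proximal-DCA-style update $x_{k+1}=\mathrm{prox}_{\psi/M}\left(x_k-(\nabla f(x_k)-\nabla g(x_k))/M\right)$. The toy objective (diagonal quadratic $f$, $\psi=\lambda\|x\|_1$, $g=\lambda\sum_i\sqrt{x_i^2+\varepsilon}$) and all function names are illustrative choices, not the paper's.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t*|.| applied to a scalar v."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def make_problem(a, b, lam, eps):
    """Toy DC objective F = f + psi - g (hypothetical example, not from the paper)."""

    def grad_f(x):
        # f(x) = 0.5 * sum a_i (x_i - b_i)^2; its gradient is Lipschitz with constant max(a)
        return [ai * (xi - bi) for ai, xi, bi in zip(a, x, b)]

    def grad_g(x):
        # g(x) = lam * sum sqrt(x_i^2 + eps), a smooth convex function
        return [lam * xi / math.sqrt(xi * xi + eps) for xi in x]

    def F(x):
        f_val = 0.5 * sum(ai * (xi - bi) ** 2 for ai, xi, bi in zip(a, x, b))
        psi_val = lam * sum(abs(xi) for xi in x)
        g_val = lam * sum(math.sqrt(xi * xi + eps) for xi in x)
        return f_val + psi_val - g_val

    return F, grad_f, grad_g


def prox_dc_step(x, grad_f, grad_g, lam, M):
    """Minimize the surrogate: linearized f + (M/2)||y - x||^2 + psi(y) - linearized g."""
    gf, gg = grad_f(x), grad_g(x)
    return [
        soft_threshold(xi - (gfi - ggi) / M, lam / M)
        for xi, gfi, ggi in zip(x, gf, gg)
    ]


def run_prox_dc(x0, a, b, lam=0.5, eps=1e-2, iters=50):
    """Run the sketch and record the objective value at every iterate."""
    F, grad_f, grad_g = make_problem(a, b, lam, eps)
    M = max(a)  # assumed regularization: the Lipschitz constant of grad f
    x = list(x0)
    history = [F(x)]
    for _ in range(iters):
        x = prox_dc_step(x, grad_f, grad_g, lam, M)
        history.append(F(x))
    return x, history


if __name__ == "__main__":
    x_final, values = run_prox_dc([3.0, -2.0, 0.5], [1.0, 2.0, 4.0], [1.0, -1.0, 0.1])
    print("final iterate:", x_final)
    print("objective, first and last:", values[0], values[-1])
```

Because the surrogate in this sketch upper-bounds $F$ and agrees with it at $x_k$, the recorded objective values should be nonincreasing, which matches the monotone decrease stated for HO-DC. Higher-order cases ($p$ or $q$ equal to 2 and beyond) would need Hessian or tensor terms in the surrogate and are not covered here.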
