Regularized methods via cubic model subspace minimization for nonconvex optimization

Stefania Bellavia, Davide Palitta, Margherita Porcelli, Valeria Simoncini

TL;DR

The paper addresses nonconvex unconstrained optimization by extending adaptive cubic regularization (AR2) to FAR2, a two-step scheme that minimizes the cubic model in a frozen, low-dimensional Krylov subspace and reuses this subspace across iterations. When the step computed in the subspace is inadequate, FAR2 takes a regularized Newton step whose regularization parameter is a by-product of the subspace minimization. This preserves the worst-case first-order complexity of AR2 while reducing the number of Hessian factorizations, because the subspace is refreshed only infrequently. The authors analyze the complexity, discuss subspace choices (polynomial and rational Krylov), and report substantial performance gains on OPM and binary classification problems, particularly when sparse direct solvers are available. The framework is general: it can be adapted to trust-region methods and extended to second-order optimality points (FAR2-SO). Overall, FAR2 offers a practical, theoretically supported way to accelerate cubic-regularized nonconvex optimization in large-scale, sparse settings.

Abstract

Adaptive cubic regularization methods for solving nonconvex problems require the efficient computation of the trial step, which involves the minimization of a cubic model. We propose a new approach in which this model is minimized in a low-dimensional subspace that, in contrast to classic approaches, is reused for a number of iterations. Whenever the trial step produced by the low-dimensional minimization process is unsatisfactory, we employ a regularized Newton step whose regularization parameter is a by-product of the model minimization over the low-dimensional subspace. We show that the worst-case complexity of classic cubic regularized methods is preserved, despite the possible regularized Newton steps. We focus on the large class of problems for which (sparse) direct linear system solvers are available and provide several experimental results showing the very large gains of our new approach when compared to standard implementations of adaptive cubic regularization methods based on direct linear solvers. Our first choice as projection space for the low-dimensional model minimization is the polynomial Krylov subspace; nonetheless, we also explore the use of rational Krylov subspaces in cases where the polynomial ones lead to less competitive numerical results.
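
The subspace idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' FAR2 implementation: the helper names (`krylov_basis`, `cubic_min_small`, `model`), the test Hessian, the bisection solver for the small problem, and the assumption that the "by-product" regularization parameter is $\lambda = \sigma \|y\|$ from the reduced problem are all illustrative choices. The acceptance test and the refresh rule are not described in this summary and are omitted.

```python
import numpy as np

def krylov_basis(H, g, m):
    """Orthonormal basis of span{g, Hg, ..., H^(m-1) g} (polynomial Krylov)."""
    V = np.zeros((g.size, m))
    V[:, 0] = g / np.linalg.norm(g)
    for j in range(1, m):
        w = H @ V[:, j - 1]
        for _ in range(2):  # two Gram-Schmidt passes for numerical stability
            w -= V[:, :j] @ (V[:, :j].T @ w)
        nw = np.linalg.norm(w)
        if nw < 1e-12:  # subspace is invariant: return what we have
            return V[:, :j]
        V[:, j] = w / nw
    return V

def cubic_min_small(g, H, sigma, iters=200):
    """Minimize g@y + 0.5*y@H@y + sigma/3*||y||^3 for a small dense H.

    Bisection on ||y(lam)|| = lam / sigma; covers the generic case only
    (the degenerate 'hard case' is not handled in this sketch).
    """
    evals, Q = np.linalg.eigh(H)
    gt = Q.T @ g
    norm_y = lambda lam: np.linalg.norm(gt / (evals + lam))
    lo = max(0.0, -evals[0])  # H + lam*I must be positive semidefinite
    hi = 2.0 * lo + 1.0
    while norm_y(hi) > hi / sigma:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if norm_y(mid) > mid / sigma:
            lo = mid
        else:
            hi = mid
    return -Q @ (gt / (evals + hi)), hi

def model(s, g, H, sigma):
    """Cubic model value relative to f(x), so model(0) = 0."""
    return g @ s + 0.5 * s @ H @ s + sigma / 3.0 * np.linalg.norm(s) ** 3

rng = np.random.default_rng(0)
n, m, sigma = 200, 10, 1.0
A = rng.standard_normal((n, n))
H = A @ A.T / n - 0.5 * np.eye(n)  # symmetric, indefinite stand-in Hessian
g = rng.standard_normal(n)

V = krylov_basis(H, g, m)            # built once, then kept ("frozen")
Hm = V.T @ H @ V                     # projected Hessian (m x m)
y, lam = cubic_min_small(V.T @ g, Hm, sigma)
s = V @ y                            # trial step in the full space

# Assumed fallback: regularized Newton step with lam from the subspace solve.
# At scale, a sparse direct factorization would replace this dense solve.
s_newton = np.linalg.solve(H + lam * np.eye(n), -g)

# Reuse the same basis for a stand-in "next" gradient (no new basis is built).
g_next = g + H @ s
y2, lam2 = cubic_min_small(V.T @ g_next, Hm, sigma)
s2 = V @ y2
```

Because `V` has orthonormal columns, $\|Vy\| = \|y\|$, so the reduced problem is again a cubic model of the same form, only of dimension `m` instead of `n`. That is what makes the small solve cheap relative to a factorization of the full Hessian.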

Paper Structure

This paper contains 17 sections, 8 theorems, 63 equations, 4 figures, 3 tables, 5 algorithms.

Key Result

Theorem 1

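Any global minimizer $s^*$ of the cubic subproblem satisfies $(H + \lambda^* I)\, s^* = -g$, where $H + \lambda^* I$ is positive semidefinite and $\lambda^* = \sigma \|s^*\|$. If $H + \lambda^* I$ is positive definite, then $s^*$ is unique. Here $g$ and $\sigma$ denote the gradient and the regularization parameter of the cubic model.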
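
A short derivation shows where the two conditions in this characterization come from. It assumes the standard cubic model with gradient $g$, Hessian $H$, and regularization parameter $\sigma > 0$; the symbols $g$, $\sigma$, and $m$ are assumptions here, since the summary does not reproduce the subproblem itself.

$$
m(s) = f + g^T s + \tfrac{1}{2}\, s^T H s + \tfrac{\sigma}{3}\, \|s\|^3,
\qquad
\nabla m(s) = g + H s + \sigma \|s\|\, s .
$$

Setting $\nabla m(s^*) = 0$ and writing $\lambda^* = \sigma \|s^*\|$ gives

$$
(H + \lambda^* I)\, s^* = -g .
$$

As a one-dimensional check, take $H = -1$, $g = -2$, $\sigma = 1$. For $s > 0$ the conditions read $(\lambda - 1)\, s = 2$ with $\lambda = s$, that is $s^2 - s - 2 = 0$, so $s^* = 2$ and $\lambda^* = 2$. For $s < 0$ they give $s^2 + s + 2 = 0$, which has no real root. Since $H + \lambda^* = 1 > 0$, the minimizer $s^* = 2$ is unique, with $m(s^*) - f = -4 - 2 + \tfrac{8}{3} = -\tfrac{10}{3}$.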

Figures (4)

  • Figure 1: Performance profiles of AR2-rqs and FAR2-pk for the OPM set. Top: number of factorizations. Bottom: number of nonlinear iterations.
  • Figure 2: Number of factorizations per nonlinear iteration for the CINA0 data set.
  • Figure 3: Number of factorizations per nonlinear iteration for the MNIST data set.
  • Figure 4: Dimension of the Krylov subspace versus the nonlinear iterations for the EIGENBLS problem in OPM, with subspace refreshes: FAR2-rk and FAR2-pk.

Theorems & Definitions (15)

  • Theorem 1
  • Theorem 2
  • Lemma 1
  • Proof 1
  • Lemma 2
  • Proof 2
  • Lemma 3
  • Proof 3
  • Lemma 4
  • Proof 4
  • ...and 5 more