Regularized methods via cubic model subspace minimization for nonconvex optimization
Stefania Bellavia, Davide Palitta, Margherita Porcelli, Valeria Simoncini
TL;DR
The paper addresses nonconvex unconstrained optimization by extending adaptive cubic regularization (AR2) to FAR2, a two-step scheme that minimizes the cubic model in a frozen, low-dimensional Krylov subspace and reuses this subspace across iterations. When the subspace-projected step is inadequate, FAR2 falls back to a regularized Newton step whose regularization parameter is a by-product of the subspace minimization. This preserves the worst-case first-order complexity of AR2 while reducing the number of Hessian factorizations, since the subspace is refreshed only infrequently. The authors analyze the complexity, discuss subspace choices (polynomial and rational Krylov), and report large performance gains on OPM and binary classification problems, particularly when sparse direct solvers are available. The framework is general: it can be adapted to trust-region methods and extended to the computation of second-order optimality points (FAR2-SO). Overall, FAR2 offers a practical, theoretically supported route to accelerating cubic-regularized nonconvex optimization in large-scale, sparse settings.
Abstract
Adaptive cubic regularization methods for solving nonconvex problems require the efficient computation of the trial step, which involves the minimization of a cubic model. We propose a new approach in which this model is minimized in a low-dimensional subspace that, in contrast to classic approaches, is reused for a number of iterations. Whenever the trial step produced by the low-dimensional minimization process is unsatisfactory, we employ a regularized Newton step whose regularization parameter is a by-product of the model minimization over the low-dimensional subspace. We show that the worst-case complexity of classic cubic regularized methods is preserved, despite the possible regularized Newton steps. We focus on the large class of problems for which (sparse) direct linear system solvers are available and provide several experimental results showing the very large gains of our new approach when compared to standard implementations of adaptive cubic regularization methods based on direct linear solvers. Our first choice as projection space for the low-dimensional model minimization is the polynomial Krylov subspace; nonetheless, we also explore the use of rational Krylov subspaces in cases where the polynomial ones lead to less competitive numerical results.
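To make the subspace idea concrete, the following is a minimal NumPy sketch of minimizing the cubic model m(s) = g^T s + (1/2) s^T H s + (sigma/3) ||s||^3 over a polynomial Krylov subspace. It is an illustration under stated assumptions, not the authors' implementation: the function names (`krylov_basis`, `reduced_cubic_step`) are hypothetical, the reduced problem is solved by an eigendecomposition plus bisection on the secular equation (a simple stand-in for whatever solver the paper uses), and the so-called hard case is assumed not to occur. The returned value `lam` = sigma ||s|| is the shift that appears in the reduced optimality condition; treating it as the by-product regularization parameter for a fallback Newton step is an assumption of this sketch.

```python
import numpy as np


def krylov_basis(H, g, m, breakdown_tol=1e-12):
    """Orthonormal basis of span{g, Hg, ..., H^(m-1) g} (hypothetical helper).

    Uses Gram-Schmidt with one reorthogonalization pass; returns fewer
    than m columns if the Krylov sequence breaks down early.
    """
    V = np.zeros((g.size, m))
    V[:, 0] = g / np.linalg.norm(g)
    for j in range(1, m):
        w = H @ V[:, j - 1]
        for _ in range(2):  # second pass limits loss of orthogonality
            w = w - V[:, :j] @ (V[:, :j].T @ w)
        norm_w = np.linalg.norm(w)
        if norm_w < breakdown_tol:
            return V[:, :j]
        V[:, j] = w / norm_w
    return V


def reduced_cubic_step(V, H, g, sigma, max_iter=200):
    """Minimize the cubic model over s = V y, with V orthonormal.

    Because V is orthonormal, ||s|| = ||y||, so the reduced model is again
    cubic, with Hessian T = V^T H V and gradient V^T g. V may come from an
    earlier iterate (a "frozen" subspace), so no structure of V^T g is assumed.
    Assumes the easy case: V^T g has a component along the eigenvector of
    the smallest eigenvalue of T.
    """
    T = V.T @ (H @ V)
    gr = V.T @ g
    theta, Q = np.linalg.eigh(T)  # eigenvalues in ascending order
    c = Q.T @ gr

    def step_norm(lam):
        return np.linalg.norm(c / (theta + lam))

    # Solve ||y(lam)|| = lam / sigma for lam > max(0, -theta_min) by bisection:
    # the left side decreases in lam and the right side increases.
    lo = max(0.0, -theta[0]) + 1e-10 * (1.0 + abs(theta[0]))
    hi = max(2.0 * lo, 1.0)
    while step_norm(hi) > hi / sigma:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if step_norm(mid) > mid / sigma:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    y = -Q @ (c / (theta + lam))
    return V @ y, lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 50))
    H = 0.5 * (A + A.T)  # symmetric, generally indefinite
    g = rng.standard_normal(50)
    V = krylov_basis(H, g, m=5)
    s, lam = reduced_cubic_step(V, H, g, sigma=1.0)
    print("step norm:", np.linalg.norm(s), "shift:", lam)
```

In this sketch the only operations involving the full dimension are the products with H used to build V and T; the minimization itself works on an m-by-m matrix. Passing a previously computed V together with a new gradient and Hessian mimics the reuse of the subspace across iterations described above.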
