Newton-CG methods for nonconvex unconstrained optimization with Hölder continuous Hessian
Chuan He, Heng Huang, Zhaosong Lu
TL;DR
This work advances second-order nonconvex optimization by developing Newton-CG methods tailored to objectives with Hölder continuous Hessians. It presents a Newton-CG method that assumes the Hölder parameters are known and a fully parameter-free variant that uses a backtracking scheme to estimate them on the fly; both achieve the best-known iteration and operation complexities for finding approximate first- and second-order stationary points. The parameter-free method retains these complexity guarantees while delivering practical gains: in numerical tests on infeasibility-detection and simple neural-network models, it outperforms a cubic-regularized Newton baseline. Overall, the paper offers implementable second-order methods with best-known complexity and second-order stationary point (SOSP) guarantees under Hölder continuity of the Hessian, including improved dependence on the Lipschitz constant of the Hessian in the ν=1 case.
Abstract
In this paper we consider a nonconvex unconstrained optimization problem: minimizing a twice differentiable objective function with Hölder continuous Hessian. Specifically, we first propose a Newton-conjugate gradient (Newton-CG) method for finding an approximate first- and second-order stationary point of this problem, assuming the associated Hölder parameters are explicitly known. We then develop a parameter-free Newton-CG method that requires no prior knowledge of these parameters. To the best of our knowledge, this method is the first parameter-free second-order method achieving the best-known iteration and operation complexity for finding an approximate first- and second-order stationary point of this problem. Finally, we present preliminary numerical results demonstrating the superior practical performance of our parameter-free Newton-CG method over a well-known regularized Newton method.
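For orientation, the setting described above can be written out using the standard definitions. This is a sketch rather than the paper's exact statement: the symbols H_\nu, \epsilon_g, and \epsilon_H are notation assumed here (only ν appears in the text above), and the paper may impose the Hölder condition only on a region of interest, such as a level set, rather than on all of R^n.

```latex
% Hölder continuity of the Hessian with exponent \nu and constant H_\nu
% (assumed notation); \nu = 1 recovers a Lipschitz continuous Hessian.
\[
  \|\nabla^2 f(x) - \nabla^2 f(y)\| \le H_\nu \, \|x - y\|^{\nu}
  \quad \text{for all } x, y,
  \qquad \nu \in (0, 1], \; H_\nu > 0 .
\]

% Approximate first-order stationary point, with tolerance \epsilon_g > 0:
\[
  \|\nabla f(x)\| \le \epsilon_g .
\]

% Approximate second-order stationary point: the condition above together with
% a lower bound on the smallest Hessian eigenvalue, with tolerance \epsilon_H > 0:
\[
  \lambda_{\min}\!\big(\nabla^2 f(x)\big) \ge -\epsilon_H .
\]
```

The "Hölder parameters" referred to in the abstract correspond to the pair (ν, H_\nu) in this notation; the parameter-free method is the one that does not take them as input.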
