A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees

Yuhao Zhou; Jintao Xu; Bingrui Li; Chenglong Bao; Chao Ding; Jun Zhu

TL;DR

This work introduces an adaptive, parameter-free regularized Newton method with capped CG for nonconvex optimization under a Lipschitz Hessian assumption. Using two gradient-based regularizers and a Lipschitz-estimation loop, the method achieves optimal global complexity in second-order oracle calls and near-optimal complexity in Hessian-vector products, and it converges quadratically near a point where the Hessian is positive definite. A dynamic regularization strategy reconciles global convergence with fast local behavior; the theory is supported by preliminary numerical validation on CUTEst benchmarks and physics-informed neural networks. The results suggest a practical, memory-efficient second-order solver that scales to medium-sized problems and can be extended to broader nonconvex optimization settings. Its main ingredients are negative curvature monitoring, the LipEstimation procedure, and theta-based boosting of the local rate.
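
For orientation, the standard objects behind these claims can be written out as follows. This is only a sketch in generic notation: $L_H$, $\rho_k$, and $d_k$ are illustrative symbols, and the specific form of the paper's two regularizers is not reproduced here.

$$
\|\nabla^2 f(x) - \nabla^2 f(y)\| \le L_H \|x - y\| \quad \text{for all } x, y \qquad \text{(Lipschitz continuous Hessian)}
$$

$$
\|\nabla f(x)\| \le \epsilon \qquad \text{($\epsilon$-stationary point)}
$$

$$
\bigl(\nabla^2 f(x_k) + \rho_k I\bigr)\, d_k = -\nabla f(x_k), \quad \rho_k > 0 \qquad \text{(regularized Newton equation, in its usual form)}
$$

A parameter-free method in this sense has to choose $\rho_k$ without being given $L_H$.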

Abstract

Finding an $\epsilon$-stationary point of a nonconvex function with a Lipschitz continuous Hessian is a central problem in optimization. Regularized Newton methods are a classical tool and have been studied extensively, yet they still face a trade-off between global and local convergence. Whether a parameter-free algorithm of this type can simultaneously achieve optimal global complexity and quadratic local convergence remains an open question. To bridge this long-standing gap, we propose a new class of regularizers constructed from the current and previous gradients, and we use the conjugate gradient method with a negative curvature monitor to solve the regularized Newton equation. The proposed algorithm is adaptive, requiring no prior knowledge of the Hessian Lipschitz constant, and achieves a global complexity of $O(\epsilon^{-3/2})$ in second-order oracle calls and $\tilde{O}(\epsilon^{-7/4})$ in Hessian-vector products. When the iterates converge to a point where the Hessian is positive definite, the method exhibits quadratic local convergence. Preliminary numerical results, including the training of physics-informed neural networks, illustrate the competitiveness of our algorithm.
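
The general shape of such a method can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single, fixed-constant regularizer `rho = sigma * sqrt(||grad||)` (a common choice in gradient-regularized Newton methods) instead of the proposed regularizers built from current and previous gradients, it omits the Lipschitz-estimation loop, and it uses a plain backtracking line search. All function names and the toy objective are hypothetical.

```python
import numpy as np


def cg_with_curvature_monitor(hvp, g, rho, tol=1e-10, max_iter=None):
    """Run CG on (H + rho*I) d = -g, stopping early on nonpositive curvature.

    hvp(v) must return the Hessian-vector product H @ v.
    Returns (vector, found_negative_curvature).
    """
    n = g.size
    max_iter = 2 * n if max_iter is None else max_iter
    gnorm = np.linalg.norm(g)
    d = np.zeros(n)
    r = -g.copy()  # residual of the linear system at d = 0
    p = r.copy()   # first search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = hvp(p) + rho * p
        curv = p @ Ap
        if curv <= 0.0:
            # p certifies that H + rho*I is not positive definite
            return p, True
        alpha = rs / curv
        d = d + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * gnorm:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return d, False


def regularized_newton(f, grad, hvp_at, x0, sigma=1.0, eps=1e-6, max_iter=100):
    """Illustrative gradient-regularized Newton loop (assumed, simplified)."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm <= eps:  # epsilon-stationary point reached
            return x, k
        rho = sigma * np.sqrt(gnorm)  # assumed regularizer, not the paper's
        d, neg_curv = cg_with_curvature_monitor(lambda v: hvp_at(x, v), g, rho)
        if neg_curv:
            # Use the unit negative-curvature direction, oriented downhill
            d = d / np.linalg.norm(d)
            if g @ d > 0.0:
                d = -d
        # Backtracking line search with an Armijo-type decrease condition
        fx, slope, t = f(x), min(g @ d, 0.0), 1.0
        while f(x + t * d) > fx + 1e-4 * t * slope and t > 1e-12:
            t *= 0.5
        x = x + t * d
    return x, max_iter


# Toy nonconvex objective (hypothetical): saddle at the origin, minima at (+-1, 0)
def f(x):
    return 0.25 * x[0] ** 4 - 0.5 * x[0] ** 2 + 0.5 * x[1] ** 2


def grad(x):
    return np.array([x[0] ** 3 - x[0], x[1]])


def hvp_at(x, v):
    return np.array([(3.0 * x[0] ** 2 - 1.0) * v[0], v[1]])


x_star, iters = regularized_newton(f, grad, hvp_at, [0.5, 1.0])
print(x_star, iters)
```

The solver only touches the Hessian through `hvp_at`, which is why complexity is counted separately in Hessian-vector products. Starting closer to the saddle, for example at `[0.01, 0.0]`, exercises the negative-curvature branch.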
