Computationally Faster Newton Methods by Lazy Evaluations
Lesi Chen, Chengchang Liu, Luo Luo, Jingzhao Zhang
TL;DR
This work addresses the high per-iteration cost of second-order methods for monotone nonlinear equations and convex minimization by reusing Hessians via a lazy-update scheme. It introduces LEN for monotone nonlinear equations and its accelerated variant A-LEN for convex optimization, combining a CRN/MS framework with Hessian reuse to achieve optimal iteration rates while reducing dimension-dependent cost. The authors extend these methods to strongly monotone/strongly convex settings using restart strategies (LEN-restart and A-LEN-restart), preserving fast convergence with controlled Hessian evaluations. A detailed running-time analysis, including a Schur-factorization approach for the CRN oracle and a practical MS-solver, is complemented by numerical experiments on synthetic minimax and real-world datasets, demonstrating substantial computational gains over existing second-order methods.
Abstract
This paper studies second-order optimization methods for solving monotone nonlinear equation problems (MNE) and minimization problems (Min) in the $d$-dimensional vector space $\mathbb{R}^d$. In their seminal work, Monteiro and Svaiter (SIOPT 2012, 2013) proposed the Newton Proximal Extragradient (NPE) method for MNE and its accelerated variant (A-NPE) for Min, which find an $\epsilon$-solution in $\mathcal{O}(\epsilon^{-2/3})$ and $\tilde{\mathcal{O}}(\epsilon^{-2/7})$ iterations, respectively. Subsequent work proved that these results are (near-)optimal, matching the lower bounds up to logarithmic factors. However, the existing lower bound applies only to algorithms that query gradients and Hessians simultaneously. This paper reduces the computational cost of Monteiro and Svaiter's methods by reusing Hessians across iterations. We propose the Lazy Extra Newton (LEN) method for MNE and its accelerated variant (A-LEN) for Min. The computational complexity bounds of the proposed methods match those of the optimal second-order methods in $\epsilon$ while reducing the dependence on the dimension by factors of $d^{(\omega-2)/3}$ and $d^{2(\omega-2)/7}$ for MNE and Min, respectively, where $d^\omega$ is the computational complexity of matrix inversion. We further generalize these methods to the strongly monotone cases and show that similar improvements still hold when a restart strategy is used.
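The idea of reusing a Hessian across iterations can be illustrated with a minimal sketch. It is not LEN or A-LEN: it is a plain Newton-type iteration in which the Hessian is evaluated and inverted only once every `m` steps, while a fresh gradient is used at every step. The function name `lazy_newton`, the refresh period `m`, and the strongly convex test problem are all assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def lazy_newton(grad, hess, x0, m=5, iters=40):
    """Newton-type iteration that refreshes the Hessian only every m steps.

    Illustrative only: no cubic regularization, extragradient step,
    or acceleration is included.
    """
    x = np.asarray(x0, dtype=float).copy()
    hess_evals = 0
    H_inv = None
    for t in range(iters):
        if t % m == 0:
            # Expensive step: evaluate and invert the Hessian.
            H_inv = np.linalg.inv(hess(x))
            hess_evals += 1
        # Cheap step: fresh gradient, possibly stale inverse Hessian.
        x = x - H_inv @ grad(x)
    return x, hess_evals


# Hypothetical test problem (an assumption, not from the paper):
# f(x) = sum_i log(cosh(x_i - b_i)) + 0.5 * x^T Q x, with Q >= 2 * I.
rng = np.random.default_rng(0)
d = 20
B = rng.standard_normal((d, d))
Q = B @ B.T / d + 2.0 * np.eye(d)
b = rng.standard_normal(d)


def grad(x):
    return np.tanh(x - b) + Q @ x


def hess(x):
    return np.diag(1.0 - np.tanh(x - b) ** 2) + Q


x_lazy, n_hess = lazy_newton(grad, hess, np.zeros(d), m=5, iters=40)
grad_norm = np.linalg.norm(grad(x_lazy))
print(n_hess, grad_norm)
```

In this sketch the matrix inversion is performed once per `m` iterations, and every other step costs only a gradient evaluation and a matrix-vector product. As a point of reference for the factors stated in the abstract, taking $\omega = 3$ (classical cubic-time inversion) gives $d^{(\omega-2)/3} = d^{1/3}$ for MNE and $d^{2(\omega-2)/7} = d^{2/7}$ for Min.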
