Higher-Order Newton Methods with Polynomial Work per Iteration
Amir Ali Ahmadi, Abraar Chaudhry, Jeffrey Zhang
TL;DR
This work introduces higher-order Newton methods of order $d$ that replace the $d$-th order Taylor expansion with an sos-convex (sum-of-squares-convex) polynomial and minimize it via semidefinite programs (SDPs). For fixed $d$, the per-iteration cost is polynomial in the dimension $n$, and the methods attain local convergence of order $d$ without requiring global convexity. A globally convergent variant is provided for odd $d$ under additional smoothness and convexity-like assumptions, with a proven $O(k^{-d})$ decrease in objective value. Numerical experiments in one and two dimensions, including a Beale function example, illustrate larger basins of attraction and faster convergence relative to classical Newton. The framework uses SOS techniques and the first level of the Lasserre hierarchy to obtain tractable SDP formulations, and opens avenues for scalable relaxations and higher-order quasi-Newton extensions in nonconvex optimization.
Abstract
We present generalizations of Newton's method that incorporate derivatives of an arbitrary order $d$ but maintain a polynomial dependence on dimension in their cost per iteration. At each step, our $d^{\text{th}}$-order method uses semidefinite programming to construct and minimize a sum of squares-convex approximation to the $d^{\text{th}}$-order Taylor expansion of the function we wish to minimize. We prove that our $d^{\text{th}}$-order method has local convergence of order $d$. This results in lower oracle complexity compared to the classical Newton method. We show on numerical examples that basins of attraction around local minima can get larger as $d$ increases. Under additional assumptions, we present a modified algorithm, again with polynomial cost per iteration, which is globally convergent and has local convergence of order $d$.
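To make "local convergence of order $d$" concrete, the following minimal sketch runs the classical Newton method (the $d = 2$ baseline, not the SDP-based higher-order method described above) on a one-dimensional test function and estimates the convergence order empirically from successive errors. The test function $f(x) = e^x - x$, the starting point, and the helper names `newton_1d` and `estimate_order` are illustrative assumptions and do not come from the paper.

```python
import math


def newton_1d(grad, hess, x0, steps):
    """Classical Newton iteration x_{k+1} = x_k - f'(x_k) / f''(x_k) in 1D."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - grad(x) / hess(x))
    return xs


def estimate_order(errors):
    """Estimate the order p in e_{k+1} ~ c * e_k**p from the last three errors."""
    e0, e1, e2 = errors[-3:]
    return math.log(e2 / e1) / math.log(e1 / e0)


# Hypothetical test problem: f(x) = exp(x) - x, minimized at x* = 0.
grad = lambda x: math.expm1(x)  # f'(x) = exp(x) - 1
hess = lambda x: math.exp(x)    # f''(x) = exp(x) > 0

xs = newton_1d(grad, hess, x0=1.0, steps=4)
errors = [abs(x - 0.0) for x in xs]  # distance to the known minimizer
order = estimate_order(errors)

print("iterates:", xs)
print("estimated order:", order)
```

For this baseline the estimated order should come out close to 2. A method with local convergence of order $d$ would, in the same experiment, satisfy $|x_{k+1} - x^*| \le c\,|x_k - x^*|^d$ near the minimizer, so the same estimate would approach $d$.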
