Adaptive Multilevel Newton: A Quadratically Convergent Optimization Method
Nick Tsipinakis, Panagiotis Tigkas, Panos Parpas
TL;DR
An adaptive multilevel Newton-type method with a principled automatic switch to full Newton once its quadratic phase is reached that consistently outperforms Newton's method, Gradient Descent, and the multilevel Newton method, indicating that second-order methods can outperform first-order methods even when Newton's method is initially slow.
Abstract
Newton's method may exhibit slower convergence than vanilla Gradient Descent in its initial phase on strongly convex problems. Classical Newton-type multilevel methods mitigate this but, like Gradient Descent, achieve only linear convergence near the minimizer. We introduce an adaptive multilevel Newton-type method with a principled automatic switch to full Newton once its quadratic phase is reached. The local quadratic convergence for strongly convex functions with Lipschitz continuous Hessians and for self-concordant functions is established and confirmed empirically. Although per-iteration cost can exceed that of classical multilevel schemes, the method is efficient and consistently outperforms Newton's method, Gradient Descent, and the multilevel Newton method, indicating that second-order methods can outperform first-order methods even when Newton's method is initially slow. The promising empirical results open new avenues for designing reduced-cost second- and high-order methods with extremely fast convergence rates.
