Sketch-and-Project Meets Newton Method: Global $\mathcal O(k^{-2})$ Convergence with Low-Rank Updates
Slavomír Hanzely
TL;DR
This work addresses scalable second-order optimization for convex, self-concordant objectives by introducing SGN, a Sketchy Global Newton method that operates in random low-rank subspaces. SGN unifies sketch-and-project, subspace Newton, and subspace regularized Newton updates, delivering a global $\mathcal O(k^{-2})$ convergence rate while keeping per-iteration costs at $\mathcal O(d\tau^2)$ (and $\mathcal O(1)$ when $\tau=1$). It also achieves fast local linear convergence at a rate independent of conditioning, and global linear convergence under relative smoothness and relative convexity, all under affine-invariant geometric assumptions. Empirical results on LIBSVM logistic-loss problems corroborate the theory, showing that SGN can match or approach the performance of state-of-the-art Newton-like methods with substantially cheaper updates, highlighting its practical relevance for large-scale machine learning.
Abstract
In this paper, we propose the first sketch-and-project Newton method with a fast $\mathcal O(k^{-2})$ global convergence rate for self-concordant functions. Our method, SGN, can be viewed in three ways: i) as a sketch-and-project algorithm projecting updates of the Newton method, ii) as a cubically regularized Newton method in sketched subspaces, and iii) as a damped Newton method in sketched subspaces. SGN inherits the best of all three worlds: the cheap iteration costs of sketch-and-project methods, the state-of-the-art $\mathcal O(k^{-2})$ global convergence rate of full-rank Newton-like methods, and the algorithmic simplicity of damped Newton methods. Finally, we demonstrate that its empirical performance is comparable to that of baseline algorithms.
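
To make the third view (a damped Newton method in sketched subspaces) concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a uniform coordinate sketch of size `tau` and a damping formula driven by a hypothetical constant `L_est`; the function name, the sketch distribution, and the stepsize rule are all illustrative assumptions that the text above does not specify.

```python
import numpy as np

def sketched_damped_newton_step(grad, hess, x, tau, L_est, rng):
    """One damped Newton step restricted to a random coordinate subspace.

    grad, hess: callables returning the gradient (d,) and Hessian (d, d) at x.
    tau: number of coordinates in the sketch (assumed uniform sampling).
    L_est: hypothetical constant used only in the assumed damping formula.
    """
    d = x.shape[0]
    idx = rng.choice(d, size=tau, replace=False)    # coordinate sketch S
    # For clarity the full gradient and Hessian are formed here; a cheap
    # implementation would compute only the sketched quantities.
    g_s = grad(x)[idx]                              # S^T grad f(x)
    H_s = hess(x)[np.ix_(idx, idx)]                 # S^T hess f(x) S, tau x tau
    n_s = np.linalg.lstsq(H_s, g_s, rcond=None)[0]  # sketched Newton direction
    local_norm = np.sqrt(max(g_s @ n_s, 0.0))       # sketched gradient in the local norm
    if local_norm == 0.0:
        return x                                    # stationary within this subspace
    # Assumed damping rule: alpha lies in (0, 1) and tends to 1 as the
    # sketched gradient vanishes.
    G = L_est * local_norm
    alpha = (-1.0 + np.sqrt(1.0 + 2.0 * G)) / G
    x_new = x.copy()
    x_new[idx] -= alpha * n_s                       # update only the sketched coordinates
    return x_new

# Usage on a strongly convex quadratic f(x) = 0.5 x^T A x - b^T x.
rng = np.random.default_rng(0)
M = rng.standard_normal((8, 5))
A = M.T @ M + np.eye(5)
b = rng.standard_normal(5)
f = lambda x: 0.5 * x @ A @ x - b @ x
x = np.zeros(5)
for _ in range(50):
    x = sketched_damped_newton_step(lambda z: A @ z - b, lambda z: A, x, 2, 1.0, rng)
```

Each step solves only a `tau` by `tau` linear system and touches `tau` coordinates, which is where the low per-iteration cost of subspace methods comes from. On the quadratic above, any damping factor in (0, 1] cannot increase the objective, so the loop decreases `f` monotonically.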
