Optimal Shrinkage for Distributed Second-Order Optimization
Fangzhao Zhang, Mert Pilanci
TL;DR
We introduce a novel, asymptotically unbiased shrinkage-based estimator for the resolvent of Gram matrices and characterize its non-asymptotic convergence rate in the isotropic case.
Abstract
In this work, we address the problem of Hessian inversion bias in distributed second-order optimization algorithms. We introduce a novel shrinkage-based estimator for the resolvent of Gram matrices that is asymptotically unbiased, and we characterize its non-asymptotic convergence rate in the isotropic case. We apply this estimator to bias correction of Newton steps in distributed second-order optimization algorithms, as well as in randomized sketching-based methods. Using analytical expressions, we examine the bias present in the naive averaging-based distributed Newton's method and contrast it with our proposed bias-free approach. Our approach leads to significant improvements in convergence rate compared to standard baselines and recent proposals, as shown through experiments on both real and synthetic datasets.
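The inversion bias mentioned in the abstract can be illustrated with a small numerical sketch. This is not the estimator proposed in the paper: it assumes the simplest setting (no regularization, isotropic standard Gaussian data) and uses only the classical inverse-Wishart mean, under which the inverse of a local Gram matrix A^T A / n has expectation n / (n - d - 1) times the identity when n > d + 1. The function name `averaged_inverse_gram` and the values of `n`, `d`, and `workers` are hypothetical choices made for illustration.

```python
import numpy as np


def averaged_inverse_gram(n, d, workers, rng):
    """Average (A_i^T A_i / n)^{-1} over workers.

    Each A_i is an n x d matrix of i.i.d. standard Gaussian entries, so the
    population Gram matrix (and hence its inverse) is the identity.
    """
    total = np.zeros((d, d))
    for _ in range(workers):
        A = rng.standard_normal((n, d))
        total += np.linalg.inv(A.T @ A / n)
    return total / workers


rng = np.random.default_rng(0)
n, d, workers = 50, 10, 2000  # hypothetical sizes; requires n > d + 1

# Naive averaging of local inverses: it does not converge to the identity,
# no matter how many workers contribute.
naive = averaged_inverse_gram(n, d, workers, rng)

# Scalar shrinkage from the inverse-Wishart mean (isotropic, unregularized
# case only). This is an illustrative stand-in, not the paper's estimator.
shrink = (n - d - 1) / n
corrected = shrink * naive

naive_scale = np.trace(naive) / d
corrected_scale = np.trace(corrected) / d
print(f"naive average scale:     {naive_scale:.3f}")
print(f"corrected average scale: {corrected_scale:.3f}")
```

Under these assumptions, the naive average should settle near n / (n - d - 1), which is about 1.28 here, rather than 1, while the shrunk average should be close to 1. Adding more workers reduces the variance of the average but leaves this bias untouched, which is why a correction of the local inverses is needed.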
